8

I'm trying to use the ECG data stream provided by a Nymi Band to calculate a users Heart Rate. My current approach is to obtain a 10 second sample of ECG data via the Nymi Bands ECG stream, check for the heart beats and multiply by 6 to get the BPM. By subtracting the previous value from the current value and storing it an a List, I get a pretty accurate graph of the ECG stream. The problem is that I am having difficulties accurately determining when a heartbeat actually occurs.

My guess is that I need to apply some form of filter first, to ensure "noise" does not negatively impact the readings. So here's my question: is there a cleaner and more accurate means of analysing the data for possible heart beats? Or how could I go about properly filtering the data to remove "noise"?


Edit 1 ( code and sample data):

-First Approach: I used a variation of Chauvenet's criterion to try and catch the outliers, which would be representative of the Heartbeat. However, the Standard Deviation was always too high, and the mean too low (almost always negative), to accurately detect which values were an outlier. With the sample data (below), the result is 22 beats in 10 seconds:

private List<Integer> parseDataForHB(List<Integer> ecgValues)
{
    double mean = mean(ecgValues);
    double standardDeviation = standardDeviation(ecgValues);
    Iterator it = ecgValues.iterator();

    List<Integer> heartBeatValues = new ArrayList<>();

    NormalDistribution normalDistribution = new NormalDistribution(mean, standardDeviation);
    while(it.hasNext())
    {
        int ecgVal = (Integer) it.next();
        stringBuilder.append(", " + ecgVal);

        if((normalDistribution.cumulativeProbability((double)ecgVal) * ecgValues.size()) < 0.5)
        {
            heartBeatValues.add(ecgVal);
        }
    }
    return heartBeatValues;
}

-Second Approach: Double pass through, finding the average heart beat value. First pass; use the max value of the entire data set, as a "starting average", then look for all values that are at least 1/2 of the max value, this data is used to create an average value for all the beats detected in the first pass. Second pass; iterate through all the values again looking for any values at least 50% of the new average value. This has proven to be more accurate than the first approach, but still falsely detects/discards heart beats. With the sample data (below), the result is 7 beats in 10 seconds:

private List<Integer> parseDataForHB(List<Integer> ecgValues, int averageHeartBeatValue)
{
    int previousVal = 0;
    List<Integer> heartBeatValues = new ArrayList<>();
    Iterator it = ecgValues.iterator();

    while(it.hasNext())
    {
        int ecgVal = (Integer)it.next();
        if(ecgVal >= (averageHeartBeatValue * .5))
        {
            if(((ecgVal > 0) && (previousVal < 0)) ||
                    ((ecgVal < 0) && (previousVal > 0)))
            {
                heartBeatValues.add(ecgVal);
                averageHeartBeatValue = (int) mean(heartBeatValues);
            }
        }
        previousVal = ecgVal;
    }
    return  heartBeatValues;
}

Sample data (when graphed, there are 10 visible spikes, which represent heartbeats):

-59752, -66222, -45702, -34272, -25891, -19203, -13547, -12212, -5916, -8793, -5083, -2075, 3231, 6295, 4898, 3029, 3427, 2161, 4274, -1209, 3428, -1793, 2560, 5195, 1092, 8088, 7539, 6673, 7338, 8527, 11586, 12264, 7979, 4316, 8383, 3198, 2555, 3574, 753, 2964, -3042, 901, -3218, -6178, -21116, 24346, -602, -1520, -3454, -1430, -7914, -1906, -6920, -8216, -8013, -6836, -7863, -1031, 3049, -271, -1010, 1562, -166, -1069, 1143, 3268, -1074, -258, -749, 433, -450, 2612, -2582, 1063, -2656, 3751, -1608, 637, -997, -7, 1155, -556, -1397, 2807, -967, 2946, 1198, -1133, -11066, 5439, 11159, -1066, 643, -34, 441, 1378, 1451, -1664, -2054, -2390, -1484, -1227, 5589, 5151, 4068, 3040, -2243, 1762, -2942, 51, 1793, 245, 171, 639, -375, 1296, -1327, 729, -624, -2642, 3964, -2641, 286, -2766, -393, -316, 2343, -3658, -552, 613, 2687, -1347, 539, -11251, 2873, 14529, -5234, -919, -2486, -3641, 4647, 0, -2149, -4063, -2619, -749, 18, 5274, 6670, 1413, 2697, 2673, 157, -180, 166, 2352, 454, 2013, -2867, 3788, -423, 1680, 1167, -1282, 1554, 768, 298, 205, -480, 2618, 531, -839, -1067, -1056, 1693, 3300, 52, -2087, 259, -5031, -4896, 15720, -3576, -3005, 849, -2643, 2204, -4461, -1953, -572, -3743, -3664, -2254, 3326, 7791, 2388, -1847, 2592, -1142, -1550, 1224, -1044, -1698, -481, 1469, -479, -125, -1853, 455, -38, 167, -55, -2126, -2291, 96, 1179, -2948, -1960, -876, 29, -2660, 1465, -1025, -2131, 2058, -3111, -19865, 20644, 1786, -2853, -2190, -2047, -1873, -643, -921, -3191, -3524, -5160, -3216, 2431, 7117, 1796, 2435, -516, 1557, -1248, -2745, -860, -618, -565, -93, 602, -3364, -1658, 1398, -126, -1715, -1685, 680, -1805, 232, -2093, -1703, -2844, -628, -2049, -1450, 1737, -1216, 2681, -2963, -4605, -11062, 15109, 133, -3804, -2971, -1867, -194, -1433, -4328, -2887, -4452, -3241, -1997, 1815, 6139, 1655, 1583, 520, -2574, -2458, 299, -2345, -475, 991, -2273, -1038, -154, 267, -1528, -1720, -440, -77, -1717, -28, -2684, -606, -1862, -560, -2120, -900, -4206, 2636, -8, -917, -1249, -3586, -13119, 8999, 6520, -2474, -3229, -1804, -1933, -1104, -3035, -1307, -3457, -4996, -2804, -2841, 3889, 6843, 1992, -671, 548, -1871, -2000, 1441, -1519, -2303, -1067, 1131, -1001, -1396, -289, -968, 1864, -3006, -1918, -72, -239, -589, -2233, -1982, 2608, -2765, -1461, -2215, -1916, 2924, -13, 342, -446, -3427, -19378, 20846, 2310, -6999, -1806, -728, -932, -2081, -2129, -2054, -4103, -2641, -4826, 1457, 3338, 6764, 2363, -1811, 453, -2577, -796, -237, -663, -1594, -170, -922, -149, -2258, -816, -1250, -1640, 2522, -4363, 668, -3494, -557, -21, -263, -4197, 694, -2921, -161, -3000, -852, 3120, 339, -1138, -2066, -4505, -13751, 17435, -446, -4212, -1339, -2239, -223, -1322, -3550, -3987, -2102, -3505, -3971, 3695, 3535, 3150, 2459, 1575, -3297, -383, -1470, 1556, -2191, -123, -1444, -1572, 1973, -3773, 1206, -860, -1384, -395, -818, -934, -940, -494, 795, -1416, -3613, -442, 622, -2798, 1296, -373, -400, -1270, 278, -5536, -14798, 20071, -2973, -3795, -754, -3358, -393, -2279, -1834, -1983, -5568, -4118, -2595, 1443, 6367, 3245, 1500, -1697, 1287

This data sample has more "noise", which I would ideally like to filter out:

-35751, -32565, -28033, -23493, -18135, -10310, -8731, -4143, -5485, -2162, -955, -6393, -4211, -3047, -3097, -3232, -2975, -1571, -2105, -1440, -3880, -372, -227, -1266, -2269, -299, 2255, -2534, -3677, 675, 78, 415, -2274, -2256, 875, -13756, -5896, 15991, 585, -4356, 2706, -2028, 2127, -2249, -1282, -2555, -2865, -2570, -2666, 3745, 5965, 2728, -73, 611, 342, 1297, 214, -1153, 496, -283, -1868, 1791, -541, 2044, -414, 1595, 72, -2262, -363, 1855, -649, 909, -815, -363, 2791, 152, 1072, -2025, 1291, -12311, -6729, 22739, -4036, -784, 2598, -871, -2182, 1244, -2158, -2403, -1551, -3825, -4385, 4281, 5919, 6609, -2120, 480, 1070, -736, 525, -1520, -2225, 1795, 574, 781, -584, -1750, 175, 3339, -1175, 1186, -1319, 361, 885, -46, -1078, -2569, -720, 1533, 2465, 113, -1953, 2475, -5732, -22272, 24177, 235, 1385, -3850, 2291, -1417, -2452, -862, -3745, -932, -3586, -3987, -69, 5431, 3902, 2284, -619, 609, -1424, -1467, -1055, -1166, -1216, 1515, -1851,  -49, -4983, 1495, 3563, -873, -1933, -397, -933, 546, -1925, -753, -53, -2603, -591, 769, 3005, -2773, 2097, -5993, -21911, 23700, 3747, -4986, 595, -1815, -1589, -571, -2116, -1823, -6708, -1686, -1891, -991, 5178, 3719, 1188, -2394, 3992, -1555, -5306, 2830, 25, -2564, 2112, -1723, -3810, 4700, -2780, 520, -70, -2015, 1093, -2231, 2526,  -4651, -799, 764, -2429, 272, -564, 1119, -1089, 2371, -5627, -8118, 7574, 6499, -8635, 582, -2186, -1986, -477, -2178, -707, -6743, -3582,  -4409, 1806, 2718, 5820, -272, 1046, -580, -1552, -1184, -3206, -690, 1218, -871, -1919, -2552, 2127, -754, -1848, -3573, 3112, -1170, 468, -2593, -382, -3280, 3664, -5572, 1992, -30, -7230, 8670, -2504, -4969, -14813, 225, 14109, 8194, -9438, -4781, 3102, -8626, 6428, -5387, -5050, 548, -10060, 6965, -2155, 2195, 5498, 359, -4090, 5130, -4214, 1478, -364, -6444, 5889, -3363, -1621, -3570, 8390, -5828, -1472, 841, -8869, 11057, -6734, 173, 535, -638, -2628, -2751, 4754, 514, -2423, 1168, -3860, -23875, 18070, 7511, -3048, -1173, -6033, 5087, -5258, -3012, -831, -1180, -5298, -557, -2993, 6236, 1417, 2683, 361, 2293, -4117, 1122, -1922, -3730, 2705, -848, -3560, 2100, -319, -495, -347, -2329, 1341, -805, 1227, -2463, -440, -1440, 1206, -2361, -411, -1481, 3837, -3101, 1851, -5779, -22183, 22335, 3443, -3854, -2077, -2311, 1471, -817, 792, -7227, -2963, -4038, -92, -1234, 4692, 3973, 2122, 1333, -222, -2997, 1279, -3531, 1335, 140, -375, -2235, 2795, 598, -3233, -951, 1895, -288, -925, 1066, -3400, -1230, -2011, 2217, 1942, -1790, -1700, -1450, 756, -10710, -6744, 18590, -1435, -1739, -2097, -2638, -454, 67, -4556, -695, -5602, -2815, -2142, 764, 5958, 2175, 2055, -647, -466, -478, -1082, 527, -2214, 275, 274, -1687, -2358, 31, 1570, -1587, -871, -271, -2365, 1337, -831, -1095, -2056, -208, -1383, 2415, -1523, -1538, -719, -3842, -20933, 15223, 9978, -4030, -2521, 190, -4163, -2305, 1814, -2465, -4207, -3792, -2559, -2123, 2908, 5366, 2933, -1455, -57, 112, -2241, -1416, -2778, 2353, -1200, -2027, -962, 1117, -1530, 157, -2902, 3466, -5072, 555, 1425, -2791, -1369, 156, -6789, 1961, -1111, 3631, -2592, -1643, 2039, -2865

Update 1 - Following @stackoverflowuser2010 recommendation, I gave a shot at using FFS to transform the ECG data into a Frequency Spectrum in order to calculate the peaks of the actual frequency. However, the results here weren't much better when passed through either Approach 1 (Chauvenet's Criterion) or Approach 2 (calculation based on the average heartbeat value). Maybe I'm missing something here? Here were the results using the same dataset:

TransformType.FORWARD: Approach 1 = 1, Approach 2 = 266

TransformType.INVERSE: Approach 1 = 1, Approach 2 = 0

I think part of the issue is that in order to use FFT, the data has to be a power of 2. As the data stream's size varies (recording for 10 seconds, a faster heartbeat will generate a larger dataset), I have to pad the end of the dataset if its size is not a power of 2.

Here's the new code, for the FFT functionality:

 private List<Integer> ffs(List<Integer> ecgValues)
{
    List<Integer> transoformedStream = new ArrayList<>();
    FastFourierTransformer ffs = new FastFourierTransformer(DftNormalization.STANDARD);
    double[] input = convertToDoubleArray(ecgValues);

    Complex[] complex = ffs.transform(input, TransformType.FORWARD);

    for(int i = 0; i < complex.length - 1; i++)
    {
        double real = (complex[i].getReal());
        double imaginary = (complex[i].getImaginary());

        transoformedStream.add((int)Math.sqrt((real * real) + (imaginary * imaginary)));
    }

    return transoformedStream;
}

private double[] convertToDoubleArray(List<Integer> ecgValues)
{
    double[] convertedList;

    if(isPowerOfTwo(ecgValues.size()))
    {
        convertedList = new double[ecgValues.size()];
    }
    else
    {
        convertedList = new double[nextPowerOfTwo(ecgValues.size())];
    }

    for(int i = 0; i < ecgValues.size(); i++)
    {
        convertedList[i] = (double)ecgValues.get(i);
    }
    return convertedList;
}

private boolean isPowerOfTwo(int size)
{
    boolean isPowerOfTwo = ((size & -size) == size);

    return isPowerOfTwo;
}

private int nextPowerOfTwo(int size)
{
    int res = 2;
    while (res <= size) {
        res  *= 2;
    }

    return res;
}

Slight modification to the while-loop in the code for Approach 2:

while(it.hasNext())
    {
        int ecgVal = (Integer)it.next();
        if(ecgVal >= (averageHeartBeatValue * .5))
        {
                heartBeatValues.add(ecgVal);
                averageHeartBeatValue = (int) mean(heartBeatValues);
        }
    }

Update 2 - Continuing to work with the FFT data, but still not certain if I'm on the right path here. Using the same method listed above for FFT (which uses "org.apache.commons.math3.transform.FastFourierTransformer"), I searched for the peak value in the FFT results. As this value was too high, I followed another approach I found, here you multiply the peak by the signal frequency (in this case, 50), and divide by the sample size. For the sample below, it calculates like so:

50hz * 423079 (peak) / 510 (sample size) = 41478.33

Alternatively:

50hz * 179 (index of the peak) / 510 (sample size) = 17.54

Here's the ECG values:

-70756.0, -56465.0, -52389.0, -25199.0, -20352.0, -13660.0, -12615.0, -9202.0, -10225.0, -6168.0, -5338.0, 4409.0, -1204.0, 3009.0, 1821.0, -3127.0, 2076.0, 720.0, 675.0, -880.0, 622.0, 1851.0, -915.0, 1296.0, -3069.0, -10.0, 1114.0, 2335.0, -4363.0, 3386.0, -189.0, -2497.0, 6326.0, -4007.0, -2708.0, 1120.0, -2159.0, 2643.0, -1817.0, 749.0, 6096.0, -2927.0, -1514.0, -24006.0, 18897.0, 10851.0, -2934.0, -1487.0, -1660.0, 90.0, 1999.0, -4448.0, 2567.0, -1185.0, -2172.0, -4479.0, -253.0, 5173.0, 5956.0, 2814.0, 3279.0, 1617.0, 5174.0, -4152.0, 911.0, 2404.0, 1579.0, 792.0, 573.0, -28.0, 3251.0, 159.0, -2170.0, 727.0, 2652.0, -2676.0, 3039.0, -2938.0, 2539.0, 1586.0, -1447.0, 132.0, -60.0, 439.0, -87.0, -2239.0, 2074.0, 1268.0, -3559.0, 1266.0, -18937.0, -869.0, 25032.0, -6298.0, -1653.0, 590.0, -1737.0, -3840.0, -484.0, -3408.0, -2470.0, -3663.0, -1526.0, -158.0, -748.0, 5249.0, -44.0, 1903.0, -1900.0, 2513.0, -58.0, -2065.0, -450.0, -1131.0, -2262.0, 3663.0, -2968.0, 1262.0, -1687.0, -2745.0, -581.0, -11.0, -528.0, 349.0, -2231.0, -1198.0, -2039.0, 1362.0, -3671.0, 580.0, -794.0, -3924.0, -1711.0, 2093.0, -935.0, 2423.0, -1017.0, -5674.0, -26830.0, 27284.0, 4433.0, -4604.0, -2655.0, -4541.0, -2643.0, 2036.0, -3159.0, -3194.0, -2030.0, -2535.0, -5753.0, -31.0, 5056.0, 241.0, 4452.0, -1591.0, -1056.0, 573.0, -3637.0, -1224.0, -2728.0, 3535.0, -2645.0, -1281.0, -1359.0, -1918.0, 621.0, -2967.0, 2535.0, -3048.0, -2820.0, -2530.0, -1202.0, 315.0, -645.0, -3541.0, -3547.0, -2725.0, -4590.0, -124.0, 620.0, -1866.0, -4450.0, -17536.0, 4480.0, 16119.0, -7421.0, 2363.0, -8373.0, 3109.0, -896.0, -6533.0, -1502.0, -378.0, -3602.0, -5893.0, -2730.0, 2619.0, 3532.0, 675.0, -778.0, -590.0, 288.0, -3793.0, -3934.0, -830.0, 564.0, -1103.0, -5270.0, 121.0, 950.0, -2570.0, -502.0, -1556.0, -142.0, -1683.0, -2455.0, -3154.0, -2773.0, -2883.0, -1375.0, -2866.0, -5988.0, 1914.0, -2311.0, -1654.0, -2757.0, -4321.0, -29329.0, 26384.0, 2636.0, -5619.0, -3352.0, -5555.0, -72.0, -5429.0, -751.0, -2445.0, -8749.0, -4021.0, -912.0, -2294.0, 6468.0, 135.0, 1281.0, -2321.0, -320.0, -2578.0, -3737.0, -1470.0, -1841.0, -631.0, -1108.0, -2371.0, -2055.0, -3166.0, -1419.0, -677.0, -3666.0, -881.0, -20.0, -4403.0, 1366.0, -3804.0, 1064.0, -10377.0, 4307.0, -3898.0, -845.0, 3795.0, -7509.0, -21636.0, 12672.0, 9857.0, -2862.0, -4136.0, -1805.0, -5989.0, 410.0, 1048.0, -13174.0, -949.0, -3802.0, -4939.0, 1437.0, -506.0, 1305.0, 6104.0, -1481.0, -3925.0, 1949.0, -1001.0, -4920.0, -172.0, -1043.0, -1158.0, -2925.0, -994.0, -2615.0, 720.0, -8393.0, 3785.0, -3428.0, -7614.0, 5963.0, -1540.0, -4688.0, -722.0, 881.0, -4912.0, 2058.0, -493.0, -7200.0, 4413.0, -34168.0, 29170.0, 1335.0, -4874.0, -13611.0, 8360.0, -4880.0, 1229.0, -4077.0, -7090.0, 4488.0, -8641.0, -3558.0, -2288.0, 3415.0, -1972.0, 4252.0, -578.0, -2509.0, -1106.0, -297.0, -3186.0, 1630.0, -5392.0, 261.0, -446.0, -12592.0, 10760.0, -3906.0, -3190.0, -2114.0, -1968.0, 880.0, 883.0, -3583.0, -4262.0, -4495.0, 505.0, 2194.0, -469.0, -5780.0, 5805.0, -11440.0, -21706.0, 27385.0, -8533.0, 2782.0, 362.0, -5929.0, -1915.0, -4238.0, 1071.0, -8529.0, 2317.0, -7595.0, -5143.0, 240.0, 6792.0, -2586.0, 5445.0, -2862.0, -3263.0, -4361.0, 3596.0, -3985.0, -438.0, -1449.0, -2594.0, 627.0, -3802.0, 1196.0, -2165.0, 319.0, -4753.0, -5308.0, 3199.0, -3945.0, -2982.0, 850.0, -1623.0, -2724.0, -828.0, -3097.0, -6728.0, 4599.0, 1662.0, -6493.0, 2834.0, -35656.0, 20133.0, 12750.0, -7834.0, -1832.0, 172.0, -11288.0, 13703.0, -12787.0, -6303.0, -2303.0, -2038.0, -7853.0, 8006.0, 707.0, -811.0, 3311.0, -2042.0, -1985.0, -423.0, -2754.0, 335.0, -5464.0, 600.0, -3398.0, -866.0, -1193.0, -2135.0, -2609.0, 1194.0, -2424.0, -2590.0, -3526.0, 790.0, -5170.0, 5491.0, 51.0, -14384.0, 9287.0, -4215.0, -7155.0, 9432.0, -12910.0, -1309.0, 5215.0, -3607.0, -6808.0, 9298.0, -22541.0, -12006.0, 28921.0, -9387.0, -1677.0, -656.0, -4015.0, -998.0, -1964.0, -5664.0, -4743.0, -3378.0, -9891.0, 6259.0, -585.0, 3174.0, -315.0, -507.0, -132.0, -463.0, -2709.0, -1921.0, -2463.0, -2316.0, 455.0, -2531.0

And here's the FFT values:

850159, 149286, 265943, 245545, 268816, 273358, 259215, 258683, 247526, 273654, 242403, 281878, 307284, 278415, 271214, 258875, 253768, 252473, 255385, 220324, 231414, 242633, 226099, 191531, 248391, 171515, 218672, 186567, 214938, 224413, 216581, 235749, 186375, 164166, 44581, 278924, 93980, 175930, 178638, 154459, 170033, 192662, 140531, 132274, 128717, 119741, 260519, 78757, 246641, 188627, 160756, 119053, 131311, 98181, 100447, 111493, 168179, 130609, 95353, 186940, 109973, 110107, 97234, 140556, 196081, 214005, 135410, 35912, 141008, 138413, 52177, 175686, 129286, 90057, 164437, 186183, 188454, 219768, 101066, 182511, 147675, 20046, 328759, 143892, 75628, 127744, 111484, 255969, 211560, 3946, 82988, 207029, 98322, 130963, 168633, 122201, 38624, 340126, 168085, 115223, 37400, 94940, 85540, 108631, 51006, 197575, 146065, 51800, 239245, 67848, 263602, 69630, 78250, 125533, 164151, 215253, 147920, 208686, 64569, 229339, 93518, 260792, 39166, 125931, 242542, 48721, 174348, 141559, 125815, 78765, 79803, 270542, 135343, 89293, 167074, 111937, 130130, 23251, 220470, 144755, 83364, 59643, 263924, 81461, 146219, 101076, 98141, 100952, 145975, 170965, 107258, 24782, 164298, 133108, 153683, 96266, 184367, 252932, 66484, 150744, 140932, 48479, 196921, 85676, 117759, 220018, 87578, 204263, 406546, 205701, 153631, 329187, 232988, 75216, 88677, 77744, 201402, 237572, 39696, 254693, 423076, 393125, 318252, 98043, 212493, 70255, 3664, 148288, 81766, 31081, 173588, 262050, 240517, 72926, 194867, 166347, 41535, 163457, 90379, 27538, 87297, 161587, 182472, 36915, 262205, 199485, 215211, 87933, 59445, 76130, 66797, 263300, 108378, 205190, 221071, 272146, 213902, 125151, 171001, 44875, 107620, 118709, 32582, 17918, 91632, 166583, 131732, 270558, 152837, 146896, 61740, 39048, 180589, 208806, 163988, 130691, 186421, 88166, 331794, 293086, 188767, 104598, 61049, 66532, 92698, 172981, 51492, 144210, 96422, 146135, 143004, 337824, 130458, 91313, 137682, 112294, 263795, 112294, 137682, 91313, 130458, 337824, 143004, 146135, 96422, 144210, 51492, 172981, 92698, 66532, 61049, 104598, 188767, 293086, 331794, 88166, 186421, 130691, 163988, 208806, 180589, 39048, 61740, 146896, 152837, 270558, 131732, 166583, 91632, 17918, 32582, 118709, 107620, 44875, 171001, 125151, 213902, 272146, 221071, 205190, 108378, 263300, 66797, 76130, 59445, 87933, 215211, 199485, 262205, 36915, 182472, 161587, 87297, 27538, 90379, 163457, 41535, 166347, 194867, 72926, 240517, 262050, 173588, 31081, 81766, 148288, 3664, 70255, 212493, 98043, 318252, 393125, 423076, 254693, 39696, 237572, 201402, 77744, 88677, 75216, 232988, 329187, 153631, 205701, 406546, 204263, 87578, 220018, 117759, 85676, 196921, 48479, 140932, 150744, 66484, 252932, 184367, 96266, 153683, 133108, 164298, 24782, 107258, 170965, 145975, 100952, 98141, 101076, 146219, 81461, 263924, 59643, 83364, 144755, 220470, 23251, 130130, 111937, 167074, 89293, 135343, 270542, 79803, 78765, 125815, 141559, 174348, 48721, 242542, 125931, 39166, 260792, 93518, 229339, 64569, 208686, 147920, 215253, 164151, 125533, 78250, 69630, 263602, 67848, 239245, 51800, 146065, 197575, 51006, 108631, 85540, 94940, 37400, 115223, 168085, 340126, 38624, 122201, 168633, 130963, 98322, 207029, 82988, 3946, 211560, 255969, 111484, 127744, 75628, 143892, 328759, 20046, 147675, 182511, 101066, 219768, 188454, 186183, 164437, 90057, 129286, 175686, 52177, 138413, 141008, 35912, 135410, 214005, 196081, 140556, 97234, 110107, 109973, 186940, 95353, 130609, 168179, 111493, 100447, 98181, 131311, 119053, 160756, 188627, 246641, 78757, 260519, 119741, 128717, 132274, 140531, 192662, 170033, 154459, 178638, 175930, 93980, 278924, 44581, 164166, 186375, 235749, 216581, 224413, 214938, 186567, 218672, 171515, 248391, 191531, 226099, 242633, 231414, 220324, 255385, 252473, 253768, 258875, 271214, 278415, 307284, 281878, 242403, 273654, 247526, 258683, 259215, 273358, 268816, 245545, 265943

These values are still pretty far off. On my other wrist, I have a separate wearable which tracks my heart rate, and for the given sample, it reported a rate of 77bpm.


Update 3 - Using Octive Online to test out properly running FFT (will test in Octive later). However, not sure I'm processing the data correctly. I'll continue playing with this to see if I can improve the results.

Here's the spectrum chart:

enter image description here

Here's my code:

Fs = 50;                    % Sampling frequency
T = 1/Fs;                     % Sample time
L = 476;                     % Length of signal
t = (0:L-1)*T;                % Time vector
% Sum of a 50 Hz sinusoid and a 120 Hz sinusoid
y = [ -70756 -56465 -52389 -25199 -20352 -13660 -12615 -9202 -10225 -6168 -5338 4409 -1204 3009 1821 -3127 2076 720 675 -880 622 1851 -915 1296 -3069 -10 1114 2335 -4363 3386 -189 -2497 6326 -4007 -2708 1120 -2159 2643 -1817 749 6096 -2927 -1514 -24006 18897 10851 -2934 -1487 -1660 90 1999 -4448 2567 -1185 -2172 -4479 -253 5173 5956 2814 3279 1617 5174 -4152 911 2404 1579 792 573 -28 3251 159 -2170 727 2652 -2676 3039 -2938 2539 1586 -1447 132 -60 439 -87 -2239 2074 1268 -3559 1266 -18937 -869 25032 -6298 -1653 590 -1737 -3840 -484 -3408 -2470 -3663 -1526 -158 -748 5249 -44 1903 -1900 2513 -58 -2065 -450 -1131 -2262 3663 -2968 1262 -1687 -2745 -581 -11 -528 349 -2231 -1198 -2039 1362 -3671 580 -794 -3924 -1711 2093 -935 2423 -1017 -5674 -26830 27284 4433 -4604 -2655 -4541 -2643 2036 -3159 -3194 -2030 -2535 -5753 -31 5056 241 4452 -1591 -1056 573 -3637 -1224 -2728 3535 -2645 -1281 -1359 -1918 621 -2967 2535 -3048 -2820 -2530 -1202 315 -645 -3541 -3547 -2725 -4590 -124 620 -1866 -4450 -17536 4480 16119 -7421 2363 -8373 3109 -896 -6533 -1502 -378 -3602 -5893 -2730 2619 3532 675 -778 -590 288 -3793 -3934 -830 564 -1103 -5270 121 950 -2570 -502 -1556 -142 -1683 -2455 -3154 -2773 -2883 -1375 -2866 -5988 1914 -2311 -1654 -2757 -4321 -29329 26384 2636 -5619 -3352 -5555 -72 -5429 -751 -2445 -8749 -4021 -912 -2294 6468 135 1281 -2321 -320 -2578 -3737 -1470 -1841 -631 -1108 -2371 -2055 -3166 -1419 -677 -3666 -881 -20 -4403 1366 -3804 1064 -10377 4307 -3898 -845 3795 -7509 -21636 12672 9857 -2862 -4136 -1805 -5989 410 1048 -13174 -949 -3802 -4939 1437 -506 1305 6104 -1481 -3925 1949 -1001 -4920 -172 -1043 -1158 -2925 -994 -2615 720 -8393 3785 -3428 -7614 5963 -1540 -4688 -722 881 -4912 2058 -493 -7200 4413 -34168 29170 1335 -4874 -13611 8360 -4880 1229 -4077 -7090 4488 -8641 -3558 -2288 3415 -1972 4252 -578 -2509 -1106 -297 -3186 1630 -5392 261 -446 -12592 10760 -3906 -3190 -2114 -1968 880 883 -3583 -4262 -4495 505 2194 -469 -5780 5805 -11440 -21706 27385 -8533 2782 362 -5929 -1915 -4238 1071 -8529 2317 -7595 -5143 240 6792 -2586 5445 -2862 -3263 -4361 3596 -3985 -438 -1449 -2594 627 -3802 1196 -2165 319 -4753 -5308 3199 -3945 -2982 850 -1623 -2724 -828 -3097 -6728 4599 1662 -6493 2834 -35656 20133 12750 -7834 -1832 172 -11288 13703 -12787 -6303 -2303 -2038 -7853 8006 707 -811 3311 -2042 -1985 -423 -2754 335 -5464 600 -3398 -866 -1193 -2135 -2609 1194 -2424 -2590 -3526 790 -5170 5491 51 -14384 9287 -4215 -7155 9432 -12910 -1309 5215 -3607 -6808 9298 -22541 -12006 28921 -9387 -1677 -656 -4015 -998 -1964 -5664 -4743 -3378 -9891 6259 -585 3174 -315 -507 -132 -463 -2709 -1921 -2463 -2316 455 -2531.0 ] % Sinusoids plus noise

NFFT = 2^nextpow2(L); % Next power of 2 from length of y
Y = fft(y,NFFT);
Pyy = Y.*conj(Y)/L;


plot(Pyy(1:238))
title('Power spectral density')
xlabel('Frequency (Hz)')

Update 4 - Decided to take a stab at a different approach. In this case, working with Auto-Correlation, low-pass filtering, and FFT.

First up Auto-Correlation: If there's minimal noise in the data, the results are pretty accurate. But, once noise occurs, the results are no longer reliable. Here's the code:

private float correlate(List<Float> data, int nElements, int offset)
{
    float sum = 0;

    for(int i = 0; i < nElements - offset; i++)
    {
        sum += data.get(i) * data.get(i + offset);
    }
    return sum;
}

int getBeat(List<Float> data, int n)
{
    int minEle = 0, maxEle, i;
    float minVal, maxVal;

    List<Float> correlatedValues = new ArrayList<>();

    for(i = 0; i < n; i++)
    {
        correlatedValues.add(correlate(data, n, i));
    }

    minVal = correlatedValues.get(0);

    for(i = 1; i < n; i++)
    {
        if(correlatedValues.get(i) > correlatedValues.get(i - 1))
        {
            minVal = correlatedValues.get(i);
            minEle = i;
            break;
        }
    }

    maxVal = minVal;
    maxEle = minEle;
    for (i=minEle; i<n; i++)
    {
        if (correlatedValues.get(i) > maxVal)
        {
            maxVal = correlatedValues.get(i);
            maxEle = i;
        }
    }

    return maxEle;
}

The result returned is the distance between beats. Dividing the sample length by the distance yields the heart rate for the sample. Example: 470 (Sample size) / 46 (distance) = 10 (beats per 10 second sample) * 6 = 60Bpm.

As mentioned, noise obscures this, so I tried to cobble together a low-pass filter based on this example. Here's the code I came up with:

private List<Float> lowPassFilter(List<Float> frequencies, float smoothing)
{
    float frequency = frequencies.get(0);
    for(int i  = 1; i < frequencies.size(); i++)
    {
        float currentFrequency = frequencies.get(i);
        frequency += (currentFrequency - frequency) / smoothing;
        frequencies.set(i, frequency);
    }
    return frequencies;
}

The problem is, that no matter what I run the results of the low-pass filter through (auto-correlation, Chauvenet's Criterion, or searching by peaks), the result is 0 (zero). My guess is that my implementation of the filter is off.

However, I've also tried using FFT to get the frequencies, then use these results with Auto-Correltation, and the outcome is still 0 (Zero). Here's the code for getting the frequencies with FFT:

    private List<Float> fft(List<Integer> ecgValues, TransformType transformType)
{
    int samplingFrequency = 50;

    List<Integer> transformedStream = new ArrayList<>();

    FastFourierTransformer ffs = new FastFourierTransformer(DftNormalization.STANDARD);
    double[] input = convertIntegerListToDoubleArray(ecgValues);

    Complex[] complex = ffs.transform(input, transformType);

    List<Float> magnitude = calculatePowerSpectrum(complex);

    List<Float> frequencies = powerSpectrumToFrequency(magnitude, samplingFrequency, ecgValues.size());

    return frequencies;
}

private List<Float> calculatePowerSpectrum(Complex[] complex)
{
    List<Float> magnitude = new ArrayList<>();

    for(int i = 0; i < complex.length - 1; i++)
    {
        double real = (complex[i].getReal());
        double imaginary = (complex[i].getImaginary());

        magnitude.add((float) Math.sqrt((real * real) + (imaginary * imaginary)));

    }

    return magnitude;
}
TheMoonbeam
  • 519
  • 1
  • 5
  • 15
  • 3
    This all sounds interesting, but you've provided us no code or data with which to assist. How can we suggest how you might filter the data to remove noise, when we don't have the data? – crush Jul 31 '15 at 18:51
  • Good catch; added requested info. – TheMoonbeam Aug 03 '15 at 16:47
  • This is an interesting question. Have you looked into running an FFT and looking for peaks in the frequency spectrum? – stackoverflowuser2010 Aug 03 '15 at 20:04
  • Thanks for the feedback, @stackoverflowuser2010. Updated the posting to reflect the usage of FFT. – TheMoonbeam Aug 04 '15 at 21:46
  • @TheMoonbeam: What do the A1 and A2 values mean? I have previously used FFT on other projects: after I computed the FFT, I found the frequency bin that contained the largest peak, which was relevant to the problem I was working on. In your case, presumably it would be similar. The frequency bin with the highest peak would indicate the heartbeat frequency. – stackoverflowuser2010 Aug 04 '15 at 22:19
  • @stackoverflowuser2010 A1 & A2 were shorthand for Approach 1and 2 (I've updated this in the post). I might be a little confused on your response, but it looks like in your case there was a single peak. Whereas in the data sample I'm getting, there are multiple peaks reflecting each heartbeat during the 10 second sample. Part of the issue is that the heartbeats aren't always the same value in a given sample (i.e. one beat might hit 11000, another might hit 19000), with noise hitting values close to (or in some cases above) beats with a lower value. – TheMoonbeam Aug 04 '15 at 23:14
  • @TheMoonbeam: I am referring to a peak in the frequency domain plot (magnitude vs frequency bin). Are you referring to a peak in the time domain plot (amplitude vs time)? – stackoverflowuser2010 Aug 04 '15 at 23:18
  • @stackoverflowuser2010, ah that makes a bit more sense. Could you elaborate a bit on how you've used this information in the past? That would help my understanding of how to properly utilize the FFT results. – TheMoonbeam Aug 05 '15 at 00:06
  • 1
    This problem is much more complicated than you suspect, You need to get in bed with signal analysis a bit. – Hot Licks Aug 05 '15 at 00:18
  • @TheMoonbeam, long shot here: I am trying to get the ECG from the Nymi band but it appears that is not possible since SDK v2. Would you perhaps have the SDK v1 and the code you used to share? – rsc Mar 31 '19 at 22:21

3 Answers3

3

First of all, fun question. Absolutely love it.

A heartbeat is characterized by a drop in pressure followed by a large increase in pressure, followed by a large drop then back to average.

Noise is more random than this and tends to return to average before dropping (usually).

By comparing a moving noise average to the maximum change over 3 points, we can filter out actual heartbeats from noise. You can see this in the JSfiddle below:

Fiddle

Yes, I made the display circular because I was originally just plotting it for fun. It looks cool when you make the lines fade. Also, I'm aware this isn't written in java, but the code is basically the same.

Anyhow, the relevant code is this:

var averageSpike=0;
//itterate over data
for (var i = 0; i < data.length; i++) {
  //Calc moving average
  for (var l = 0; l < 10; l++) {
    var m = i - l;
    if (m < 0)
      m += data.length;
    if (m > data.length)
      m -= data.length;
    averageSpike += Math.abs(data[m]);
  }
  //4 times average is the threshhold for a heartbeat. This may require tweaking
  averageSpike /= 2.5;
  //Get 3 points ahead
  j = i + 1;
  k = i + 2;
  //wrap around array
  if (j > data.length - 1) {
    j = 0;
  }
  if (k > data.length - 1) {
    k = k - data.length;
  }
  var p1 = data[i];
  var p2 = data[j];
  var p3 = data[k];
  //Get min and max points
  //Notice that the min can only come from points 1 and 2, and the max from
  //  2 and 3. This is important as it filters out false positives.
  var min = Math.min(p1, p2);
  var max = Math.max(p2, p3);
  //Calc the difference
  var dif = max - min;
  //check if it is greater than the noise threshold
  if (dif >= averageSpike) {
    data2.push(dif);
  } else {
    data2.push(0);
  }
}

I haven't tested with different noise thresholds.

Obviously now that you have your single spikes you simply have to record them and take a moving average of how many there are (in the given time period) to calculate bpm.

EDIT:

I've been running some tests on both data sets. By tweaking the number of points in the moving average and the divisor very slightly they can both be 100% accurate. But not at the same time. On the low noise data set, if the noise gets too low, false positives occur. This could be solved by limiting the lower bound of the noise threshold. Ideally with an equation with an asymptote at y=1 that then becomes linear... But I haven't found the right equation yet.

Issues will also arise as the bpm changes. The number of 'noise' data points will decrease as bpm rises, and thus the number of points in the moving average will need to change. This can be remedied with a simple feedback mechanism that modifies the loop count and divisor based on the current bpm.

Zwander
  • 382
  • 2
  • 12
2

First of all, let's plot the two data sets you have. Maybe you should have done that in the first place.

series 1

series 2

If you want to find the heart rate, you can probably determine the result in either the time domain or the frequency domain.

To find the heart rate in the time domain, you will need to find the peaks in the data. Your data is fairly clean, so you can potentially make do with a simple peak-finding algorithm. A search for "time series find peak" leads to the following Stackoverflow question: Peak signal detection in realtime timeseries data

That post provides several answers that you can probably hack together in a day.

As you mentioned in your original posting, there are about 10 peaks in the 10 second sample, so in 60 seconds, the heart rate would be around 60 beats per minute.

To find the heart rate in the frequency domain, you can run an FFT. To run the FFT properly and find the frequency bins, you will need to provide a sampling frequency. I am guessing that since you have 500 samples over 10 seconds, the sampling rate must have been 500 samples/10 seconds = 50 Hz.

I don't have a working Matlab or Octave installed on this computer, but you can run it yourself. For example, Mathworks has a page that shows all the code you need to run FFT and plot the results: http://www.mathworks.com/help/matlab/ref/fft.html?refresh=true

The FFT plot from that page (NOT from your data) is below:

enter image description here

In the plot above, you can see a tallest peak at 125Hz. If you use your data, the tallest singular peak will be your answer.

You obviously will not be running Matlab for your software. However, there are plenty of open-source FFT libraries available. Once the FFT is finished, you need to parse the answer to find the highest peak.

Whatever what you use to get your answer, you will need to compare it against some ground truth. I suggest using a heart-rate app on your smartphone (iPhone or Android). One heart-rate app I use is Instant Heart Rate by Azumio. This Stackoverflow question has some background on these types of apps: https://apple.stackexchange.com/questions/45176/how-accurate-are-ios-apps-that-measure-heart-rate

If you need more answers, I suggest you add the "signal-processing" tag to your question so that it's visible to folks with DSP knowledge. There is also another StackExchange board (https://dsp.stackexchange.com/) that has more experts.

When you find your solution, please post back here with your result.

Edit, 8/5/2015

Background information on FFT:

The FFT is an algorithm to find the frequency components of a time domain signal. Actually, the discrete Fourier transform (DFT) can do this for you. The FFT is just a fast implementation of a DFT (thus, Fast Fourier Transform). A recurring frequency in your time domain plot would become much more evident in the frequency domain. For example, your time domain plot shows 10 strongly recurrent peak every 10 seconds. The frequency domain plot would manifest the same data with one large peak at 10 peaks / 10 seconds = 1/second = 1Hz.

Here are some links to help you understand what FFT does. I recommend you install Matlab or Octave (the free open-source version of Matlab).

http://www.mathworks.com/help/matlab/examples/fft-for-spectral-analysis.html

http://www.dspguide.com/ch9/1.htm

This link in particular shows Matlab code to read a time series signal from a CSV file and then make a frequency spectrum plot:

http://www.mathworks.com/matlabcentral/answers/155036-how-to-plot-fft-of-time-domain-data

You must provide a sampling frequency (Fs is the common name for this variable).

Community
  • 1
  • 1
stackoverflowuser2010
  • 38,621
  • 48
  • 169
  • 217
  • thanks for the info. As was noted in the original post, I have been plotting the data in a graph. However, I did not provide images of the graph for the corresponding data (mistake on my part). moving on to your answer, I'm still a bit confused here. Not sure how the peak in the FFT results would be reflective of the actual heart rate. See "edit 4" for updated details. – TheMoonbeam Aug 05 '15 at 20:48
  • @TheMoonbeam: Updated my answer. – stackoverflowuser2010 Aug 05 '15 at 21:27
  • stackoverflowuser2010, thanks for the links. Just for clarification, when you said "The frequency domain plot would manifest the same data with one large peak at 10 peaks / 10 seconds = 1/second = 1Hz", do you mean that the single large peak will be reflective of all the heartbeats, and that each Hz of the peak is equal to a single heart beat? Also, added Edit 5, with what I've got so far using Octive Online to run the FFT. – TheMoonbeam Aug 06 '15 at 00:40
  • @TheMoonbeam: The content of the frequency domain plot will reflect whatever is in the time domain data that is provided to the FFT function. Did you try the EXACT example in http://www.mathworks.com/help/matlab/examples/fft-for-spectral-analysis.html to see if it produces the result? Also, it would be great if you could at least upvote my answer. – stackoverflowuser2010 Aug 06 '15 at 00:59
  • stackoverflowuser2010, good catch. I had tested a few other demo's from MatLab, but not the same FFT demo you linked to. Just tested and not only was it different, but the results in Octave Online, Octave (the actual application) and FreeMat are noticeable different, with none of them quite lining up to that of MatLab. Looks like I may have to run the tests in MatLab rather than Octave (at least for now). – TheMoonbeam Aug 06 '15 at 05:14
  • stackoverflowuser2010, based on the MatLab documentation, it looks like the peak(s) represent the frequency(s) of the original signal. If that's the case, I'm not 100% certain how that would help determine how many heartbeats occurred during a given sample. My only thought would be to try and apply high and low-pass filters, using frequencies converted from the peak value(s). Does this sound about right? – TheMoonbeam Aug 06 '15 at 23:58
  • @TheMoonbeam: If your heartbeats are periodic, they will show up in the frequency spectrum plot (power vs frequency) with the highest y-axis value, I think. – stackoverflowuser2010 Aug 07 '15 at 02:32
  • @TheMoonbeam: If the FFT is too complex, you should explore whether or not the correct answer can be obtained with only the time-domain data. I link that I gave earlier (http://stackoverflow.com/questions/22583391/peak-recognition-in-realtime-timeseries-data) seemed to have really good information. – stackoverflowuser2010 Aug 07 '15 at 02:34
  • I've put the "pure FFT" implementation on the back burner for a bit, as I try out some variations with auto-correlation. Sorry, forgot to mention earlier that I was also running some tests based on the other posts you provided. These were similar to one of my earlier approaches, and again work quite well accept when there's significant noise. I'm working on filtering the noise, then I'll see which approach works the best and most efficiently. – TheMoonbeam Aug 07 '15 at 18:00
2

Nice question. Here is yet another way to solve the problem:

You can detect periodic elements within a signal by doing an auto-correlation. In short, an auto-correlation can be calculated by multiplying a signal with a time-shifted version of itself and store the sum of products. Do this for all possible time-shifts and you'll get the auto-correlation.

Each element in the auto-correlation tells you how similar a signal is to itself at different time-shifts. If there is something periodic in the signal (like your heart-beat) you'll get a peak in the correlation.

Here is the auto-correlation of your first and second data-set (truncated to the first 200 elements):

Auto-correlation of first data-set Auto-correlation of first data-set

Note that all auto-correlations start with a trivial, huge peak for the first elements. That's because the signal correlated with the not time-shifted version of itself correlates perfectly. This peak quickly drops. Later on you'll find peaks that represent your heart-beat, twice the heart-beat, three times your heart-beat and so on.

The task now is quite simple: Calculate the auto-correlation of a chunk of data, skip the initial peak and search for the highest peak. That'll be placed where the signal is most periodic. E.g. where your heart-beat is located.

Here is a C-Code that does this in a brute-force way (sorry, no java):

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

static float timeseries1[] =
{
-59752, -66222, -45702, -34272, -25891, -19203, -13547, -12212, -5916, -8793, -5083, -2075, 3231, 6295, 4898, 3029, 3427, 2161, 4274, -1209, 3428, -1793, 2560, 5195, 1092, 8088, 7539, 6673, 7338, 8527, 11586, 12264, 7979, 4316, 8383, 3198, 2555, 3574, 753, 2964, -3042, 901, -3218, -6178, -21116, 24346, -602, -1520, -3454, -1430, -7914, -1906, -6920, -8216, -8013, -6836, -7863, -1031, 3049, -271, -1010, 1562, -166, -1069, 1143, 3268, -1074, -258, -749, 433, -450, 2612, -2582, 1063, -2656, 3751, -1608, 637, -997, -7, 1155, -556, -1397, 2807, -967, 2946, 1198, -1133, -11066, 5439, 11159, -1066, 643, -34, 441, 1378, 1451, -1664, -2054, -2390, -1484, -1227, 5589, 5151, 4068, 3040, -2243, 1762, -2942, 51, 1793, 245, 171, 639, -375, 1296, -1327, 729, -624, -2642, 3964, -2641, 286, -2766, -393, -316, 2343, -3658, -552, 613, 2687, -1347, 539, -11251, 2873, 14529, -5234, -919, -2486, -3641, 4647, 0, -2149, -4063, -2619, -749, 18, 5274, 6670, 1413, 2697, 2673, 157, -180, 166, 2352, 454, 2013, -2867, 3788, -423, 1680, 1167, -1282, 1554, 768, 298, 205, -480, 2618, 531, -839, -1067, -1056, 1693, 3300, 52, -2087, 259, -5031, -4896, 15720, -3576, -3005, 849, -2643, 2204, -4461, -1953, -572, -3743, -3664, -2254, 3326, 7791, 2388, -1847, 2592, -1142, -1550, 1224, -1044, -1698, -481, 1469, -479, -125, -1853, 455, -38, 167, -55, -2126, -2291, 96, 1179, -2948, -1960, -876, 29, -2660, 1465, -1025, -2131, 2058, -3111, -19865, 20644, 1786, -2853, -2190, -2047, -1873, -643, -921, -3191, -3524, -5160, -3216, 2431, 7117, 1796, 2435, -516, 1557, -1248, -2745, -860, -618, -565, -93, 602, -3364, -1658, 1398, -126, -1715, -1685, 680, -1805, 232, -2093, -1703, -2844, -628, -2049, -1450, 1737, -1216, 2681, -2963, -4605, -11062, 15109, 133, -3804, -2971, -1867, -194, -1433, -4328, -2887, -4452, -3241, -1997, 1815, 6139, 1655, 1583, 520, -2574, -2458, 299, -2345, -475, 991, -2273, -1038, -154, 267, -1528, -1720, -440, -77, -1717, -28, -2684, -606, -1862, -560, -2120, -900, -4206, 2636, -8, -917, -1249, -3586, -13119, 8999, 6520, -2474, -3229, -1804, -1933, -1104, -3035, -1307, -3457, -4996, -2804, -2841, 3889, 6843, 1992, -671, 548, -1871, -2000, 1441, -1519, -2303, -1067, 1131, -1001, -1396, -289, -968, 1864, -3006, -1918, -72, -239, -589, -2233, -1982, 2608, -2765, -1461, -2215, -1916, 2924, -13, 342, -446, -3427, -19378, 20846, 2310, -6999, -1806, -728, -932, -2081, -2129, -2054, -4103, -2641, -4826, 1457, 3338, 6764, 2363, -1811, 453, -2577, -796, -237, -663, -1594, -170, -922, -149, -2258, -816, -1250, -1640, 2522, -4363, 668, -3494, -557, -21, -263, -4197, 694, -2921, -161, -3000, -852, 3120, 339, -1138, -2066, -4505, -13751, 17435, -446, -4212, -1339, -2239, -223, -1322, -3550, -3987, -2102, -3505, -3971, 3695, 3535, 3150, 2459, 1575, -3297, -383, -1470, 1556, -2191, -123, -1444, -1572, 1973, -3773, 1206, -860, -1384, -395, -818, -934, -940, -494, 795, -1416, -3613, -442, 622, -2798, 1296, -373, -400, -1270, 278, -5536, -14798, 20071, -2973, -3795, -754, -3358, -393, -2279, -1834, -1983, -5568, -4118, -2595, 1443, 6367, 3245, 1500, -1697, 1287
};


static float timeseries2[] =
{
-35751, -32565, -28033, -23493, -18135, -10310, -8731, -4143, -5485, -2162, -955, -6393, -4211, -3047, -3097, -3232, -2975, -1571, -2105, -1440, -3880, -372, -227, -1266, -2269, -299, 2255, -2534, -3677, 675, 78, 415, -2274, -2256, 875, -13756, -5896, 15991, 585, -4356, 2706,
-2028, 2127, -2249, -1282, -2555, -2865, -2570, -2666, 3745, 5965, 2728, -73, 611, 342, 1297, 214, -1153, 496, -283, -1868, 1791, -541, 2044, -414, 1595, 72, -2262, -363, 1855, -649, 909, -815, -363, 2791, 152, 1072, -2025, 1291, -12311, -6729, 22739, -4036, -784, 2598, -871,
-2182, 1244, -2158, -2403, -1551, -3825, -4385, 4281, 5919, 6609, -2120, 480, 1070, -736, 525, -1520, -2225, 1795, 574, 781, -584, -1750, 175, 3339, -1175, 1186, -1319, 361, 885, -46, -1078, -2569, -720, 1533, 2465, 113, -1953, 2475, -5732, -22272, 24177, 235, 1385, -3850, 2291, -1417, -2452, -862, -3745, -932, -3586, -3987, -69, 5431, 3902, 2284, -619, 609, -1424, -1467, -1055, -1166, -1216, 1515, -1851,
-49, -4983, 1495, 3563, -873, -1933, -397, -933, 546, -1925, -753, -53, -2603, -591, 769, 3005, -2773, 2097, -5993, -21911, 23700, 3747, -4986, 595, -1815, -1589, -571, -2116, -1823, -6708, -1686, -1891, -991, 5178, 3719, 1188, -2394, 3992, -1555, -5306, 2830, 25, -2564, 2112, -1723, -3810, 4700, -2780, 520, -70, -2015, 1093, -2231, 2526,
-4651, -799, 764, -2429, 272, -564, 1119, -1089, 2371, -5627, -8118, 7574, 6499, -8635, 582, -2186, -1986, -477, -2178, -707, -6743, -3582,
-4409, 1806, 2718, 5820, -272, 1046, -580, -1552, -1184, -3206, -690, 1218, -871, -1919, -2552, 2127, -754, -1848, -3573, 3112, -1170, 468,
-2593, -382, -3280, 3664, -5572, 1992, -30, -7230, 8670, -2504, -4969, -14813, 225, 14109, 8194, -9438, -4781, 3102, -8626, 6428, -5387, -5050, 548, -10060, 6965, -2155, 2195, 5498, 359, -4090, 5130, -4214, 1478, -364, -6444, 5889, -3363, -1621, -3570, 8390, -5828, -1472, 841,
-8869, 11057, -6734, 173, 535, -638, -2628, -2751, 4754, 514, -2423, 1168, -3860, -23875, 18070, 7511, -3048, -1173, -6033, 5087, -5258,
-3012, -831, -1180, -5298, -557, -2993, 6236, 1417, 2683, 361, 2293, -4117, 1122, -1922, -3730, 2705, -848, -3560, 2100, -319, -495, -347, -2329, 1341, -805, 1227, -2463, -440, -1440, 1206, -2361, -411, -1481, 3837, -3101, 1851, -5779, -22183, 22335, 3443, -3854, -2077, -2311, 1471, -817, 792, -7227, -2963, -4038, -92, -1234, 4692, 3973, 2122, 1333, -222, -2997, 1279, -3531, 1335, 140, -375, -2235, 2795, 598,
-3233, -951, 1895, -288, -925, 1066, -3400, -1230, -2011, 2217, 1942, -1790, -1700, -1450, 756, -10710, -6744, 18590, -1435, -1739, -2097, -2638, -454, 67, -4556, -695, -5602, -2815, -2142, 764, 5958, 2175, 2055, -647, -466, -478, -1082, 527, -2214, 275, 274, -1687, -2358, 31, 1570, -1587, -871, -271, -2365, 1337, -831, -1095, -2056, -208, -1383, 2415, -1523, -1538, -719, -3842, -20933, 15223, 9978, -4030, -2521, 190, -4163, -2305, 1814, -2465, -4207, -3792, -2559, -2123, 2908, 5366, 2933, -1455, -57, 112, -2241, -1416, -2778, 2353, -1200, -2027,
-962, 1117, -1530, 157, -2902, 3466, -5072, 555, 1425, -2791, -1369, 156, -6789, 1961, -1111, 3631, -2592, -1643, 2039, -2865
};


float correlate (float * data, int nElements, int offset)
/////////////////////////////////////////////////////////
{
  float summ = 0;
  int i;

  for (i=0; i<nElements - offset; i++)
    summ += data[i] * data[(i+offset)];

  return summ;
}


int getBeat (float * data, int n)
/////////////////////////////////
{
  float * c = (float *) malloc (n * sizeof (float));

  int    minEle, maxEle, i;
  float  minVal, maxVal;

  // calculate the time-delayed correlation of the signal with itself:
  for (i=0; i<n; i++)
    c[i] = correlate (data, n, i);

  // Heuristic: Search for the first element that is higher than
  // it's precursor: (this is an heuristic to skip the trivial
  // correlation of the signal with itself).
  minVal = c[0];
  for (i=1; i<n; i++)
  {
    if (c[i] > c[i-1])
    {
      minVal = c[i];
      minEle = i;
      break;
    }
  }

  // Now just search for the highest peak. That's
  // where the highest periodicity in the signal is
  // located:
  maxVal = minVal;
  maxEle = minEle;
  for (i=minEle; i<n; i++)
  {
    if (c[i] > maxVal)
    {
      maxVal = c[i];
      maxEle = i;
    }
  }
  free (c);

  return maxEle;
}


int main (int argc, char **args)
{
  int nElements1 = sizeof (timeseries1) / sizeof (float);
  int nElements2 = sizeof (timeseries2) / sizeof (float);

  printf ("beat distance is %d samples\n",
    getBeat (timeseries1, nElements1));

  printf ("beat distance is %d samples\n",
    getBeat (timeseries2, nElements2));
  return 1;
}

Solutions found are:

beat distance is 46 samples
beat distance is 45 samples

I use a simple heuristic to skip the first indices by searching from left to right for the first element that has higher correlation than it's precursor. That often works well in practice. However, if you have a highest frequency of interest you can directly calculate how many initial correlations to ignore. The same applies to the lowest frequency of interest.

The auto-correlation itself can be calculated much faster using FFT and would also work with non power of twos by zero padding (I may add this later), but for the demonstration the brute force method is probably fine.

The problem with the auto-correlation approach should also be named: It is possible that two or more heart-beats correlate better than a single heart-beat. In this case you'll get half the beat frequency or twice the period. If you do constant measurement and detect that the frequency drops by an integer factor from one measurement to another you should not look for the absolute maximum in the correlation but search for a local peak near the expected frequency.

Note that I haven't done any filtering on the data. You can likely improve the results by applying a window-function and by removing noise with some digital filters.

Why the pure FFT solution is likely to fail:

You've done some experiments using the FFT of the signal and look for peaks, and you came up with not so great results. That's because the FFT is converting your time domain signal into the sine-wave components. Do your heart-beats look like sine-waves? I don't think so. They are peaks and contain among the fundamental frequency a lot of high frequency components. In fact most of the energy of your heart-beats is in the high frequency bands. That's why you find peaks waaay up in the spectrum.

Since the fundamental frequency of the beats is what you're looking for the data you got is not suitable for direct frequency domain analysis. Beside the auto-correlation you may want to take a look at the Cepstrum. This is a FFT related transformation that deals with highly harmonic signals much better.

Nils Pipenbrinck
  • 83,631
  • 31
  • 151
  • 221
  • thanks for the suggestions. The Auto-Correlation works well on smooth data. But, as you noted, significant noise drastically diminishes the results, causing the distance to be too low (i.e. 7 or 10). Working on two other approaches based on your recomendations; 1 - using auto-correlation of the FFT results (once converted to the frequencies). 2 - applying filters to the ECG data, then passing it through auto-correlation. – TheMoonbeam Aug 07 '15 at 17:57
  • @TheMoonbeam Yea, auto-correlation is not a silver bullet, just one of many tools to analyze data. – Nils Pipenbrinck Aug 07 '15 at 18:17
  • agreed. Though it's a particularly useful one (if the data is smooth enough). Added another update reflecting the direction I'm currently going in. – TheMoonbeam Aug 07 '15 at 19:18