Load MIT-BIH Arrhythmia ECG database onto MATLAB

Question

I am working on ECG signal processing with a neural network, which involves pattern recognition. I need to get all the data into MATLAB to use it as test signals, but I am finding it difficult to load. I am using the MIT-BIH Arrhythmia database.

The signals need to be indexed and stored as a data structure in a MATLAB-compatible format. Presently, they are in .atr and .dat format.

How can you load the MIT-BIH Arrhythmia database into MATLAB?

I removed your email address. It's better to put such info into your profile page rather than in your post as plain text, even if Gmail spam filtering does quite a good job. – chl, Jun 09 '11 at 05:19
I don't know exactly what you want to detect in the ECG signal, but in my opinion the MIT-BIH database is of poor quality and I would recommend finding a different one. For historical reasons it is a kind of academic standard, but if you don't need to compare your results with previous publications I'd use a different one. For example, PTB is a good one (2-minute strips from 500 different patients, 12-lead ECG). – Biggles, Jun 15 '11 at 12:45
@Polda How can you describe PTB's quality? Can you confirm it to be AAMI standard? MIT-BIH is AAMI standard in ambulatory setting. – Léo Léopold Hertz 준영, May 22 '15 at 05:24

Rashid · Accepted Answer · 2015-05-22T05:16:16.640

6

You can use the PhysioNet ATM to get .mat files, which are easier to work with.

In the input part select the desired leads, length, database and sample.

In the toolbox select export as .mat:

enter image description here

Then download the '.mat' file,

enter image description here

In order to open the file in MATLAB, here is a sample code:

load('100m.mat');            % loads the signal into the matrix "val"
val = (val - 1024)/200;      % subtract "base" and divide by "gain" (see the .info file)
ECGsignal = val(1,1:1000);   % first signal (row 1 of val), first 1000 samples
Fs = 360;                    % sampling frequency in Hz
t = (0:length(ECGsignal)-1)/Fs;  % time axis in seconds
plot(t,ECGsignal)

and you will get,

enter image description here

However, reading the annotation files for arrhythmias or QRS complexes is a separate problem.

Edit

The base and gain come from the info file (second picture). This file gives you various information regarding the ECG signal.

enter image description here

In the last sentence it says: To convert from raw units to the physical units shown above, subtract 'base' and divide by 'gain'.
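The conversion described in the .info file is a per-sample affine map, which can be sketched in Python as follows. This is a minimal illustration, not PhysioNet code: the function name `to_physical` and the sample values are made up for the example, while base = 1024 and gain = 200 are the values used in this answer.

```python
def to_physical(raw, base=1024, gain=200):
    """Convert raw ADC samples to physical units: (raw - base) / gain."""
    return [(sample - base) / gain for sample in raw]


# Hypothetical raw samples around one beat (illustrative values, not real data)
raw = [1024, 1030, 1100, 1424, 1090, 1000, 1024]
physical = to_physical(raw)

peak_raw = raw.index(max(raw))
peak_physical = physical.index(max(physical))

print(physical[3])               # (1424 - 1024) / 200 = 2.0
print(peak_raw, peak_physical)   # both 3: the peak position is unchanged
```

Because the gain is positive, the map only rescales and offsets amplitudes; the sample index of the largest value stays where it was.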

edited May 22 '15 at 05:16

answered Jul 31 '14 at 14:00

Rashid


Can you please explain your "base" and "gain" point more thoroughly? I added my answer below, where I show the difference between your method and the method where you do not do those conversions. – Léo Léopold Hertz 준영 May 18 '15 at 23:26
Hi @Masi, thanks for your attention. When you download the `.mat` file, there is another file named `.info` which states: _To convert from raw units to the physical units shown above, subtract 'base' and divide by 'gain'_. – Rashid May 19 '15 at 12:09
Can you please include the link, citation and reference where they say exactly this about *base* and *gain*? Through the ATM tool, I cannot extract any info file. It may be that they have deprecated the info file here. – Léo Léopold Hertz 준영 May 21 '15 at 21:55
I think this is the source you are recalling: http://www.physionet.org/physiotools/matlab/rddata.m It does the base and gain removal in its code. I cannot find any other source for this. – Léo Léopold Hertz 준영 May 21 '15 at 22:04
@Masi, on my second picture in the post it says: _download these files:_ there is an _.info_ file, please check that file. – Rashid May 22 '15 at 05:08
Please see Amit Juneja's answer below. It is possible that you do not need to remove base and gain. His argument is solid. Silva's tree is a newer version of the toolbox. I opened a ticket about the confusion here: https://github.com/ikarosilva/wfdb-app-toolbox/issues/119 – Léo Léopold Hertz 준영 Apr 25 '16 at 21:25

score 4 · Answer 2 · answered Jun 09 '11 at 05:51

4

You need the program rddata.m (a MATLAB script) from this website. The program can be found here. rddata.m is probably the only program you will need to read the ECG signals. I remember having used this program and database myself not too long ago.

answered Jun 09 '11 at 05:51

Sriram


score 2 · Answer 3 · answered Jun 05 '15 at 04:07

2

There is a tutorial for using MATLAB to read the data: tutorial for matlab user

1. Install "The WFDB Toolbox for Matlab" from the link above and add the toolbox folder to the MATLAB path.
2. Download the ECG signal. Be sure to download the '.atr', '.dat' and '.hea' files together for the record you want to work with.
3. Run the following commands in MATLAB:

[tm,signal,Fs]=rdsamp(filename, 1);
[ann,type]=rdann(filename, 'atr');

Note: the filename is the record name; for record '101' it is '101'. You can find detailed information about rdsamp and rdann in the tutorial.

answered Jun 05 '15 at 04:07

Niu


Here about how to download those files locally http://stackoverflow.com/a/36706214/54964 so use `physionetdb("mitdb", 1);` before running `rdsamp(...)` to reduce network load and speed up your system. I am studying dbpath because there seem to be changes in different trees. I want that many projects can use local files to speed up the system. Here, the ticket https://github.com/ikarosilva/wfdb-app-toolbox/issues/117 in Ikaro's tree. – Léo Léopold Hertz 준영 Apr 25 '16 at 21:12

Amit Juneja · Answer 4 · 2016-06-13T02:57:15.573

2

So I read this answer 3 months ago and removed the base and gain. It turns out I completely shifted my R-peaks in various directions, screwing up all my results. While I am not sure whether doing this is necessary in MATLAB, DO NOT DO THIS if you are not preprocessing your signal in MATLAB. I was preprocessing my signal in Python, and all I did to normalize it was

val = val/2047  # 2047 = 2^11 - 1 is the maximum raw value of the signals

and I used Butterworth filters to remove artifacts (passband 0.5 Hz to 45 Hz).

CORRECTION

The cutoffs I selected are 0.5 Hz and 45 Hz, not 5 Hz and 15 Hz as I previously reported. This band preserves the QRS complex for various arrhythmias without adding too much noise.

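import scipy.signal

rate = 360.0      # sampling frequency in Hz
lowfreq = 0.5     # high-pass cutoff in Hz
highfreq = 45.0   # low-pass cutoff in Hz

# baseline correction and bandpass filter of signals; ecg is a 1-D array of samples
lowpass = scipy.signal.butter(1, highfreq/(rate/2.0), 'low')
highpass = scipy.signal.butter(1, lowfreq/(rate/2.0), 'high')

# TODO: Could use an actual bandpass filter
ecg_low = scipy.signal.filtfilt(*lowpass, x=ecg)
ecg_band = scipy.signal.filtfilt(*highpass, x=ecg_low)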
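The TODO above mentions a single band-pass filter. A minimal sketch of that variant follows, assuming NumPy and SciPy are installed. The function name `bandpass_ecg` and the synthetic test signal are illustrative only; the 360 Hz rate and the 0.5 Hz to 45 Hz band are the values quoted in this thread. A first-order band-pass design is not numerically identical to the low-pass/high-pass cascade above, so treat it as an alternative rather than a drop-in replacement.

```python
import numpy as np
import scipy.signal


def bandpass_ecg(ecg, rate=360.0, lowfreq=0.5, highfreq=45.0, order=1):
    """Zero-phase Butterworth band-pass between lowfreq and highfreq (Hz)."""
    nyquist = rate / 2.0
    b, a = scipy.signal.butter(order, [lowfreq / nyquist, highfreq / nyquist],
                               btype="band")
    # filtfilt runs the filter forward and backward, so peaks are not delayed
    return scipy.signal.filtfilt(b, a, ecg)


# Synthetic stand-in for an ECG: a 10 Hz sine riding on a constant offset
t = np.arange(0, 10, 1 / 360.0)
ecg = 5.0 + np.sin(2 * np.pi * 10 * t)
ecg_band = bandpass_ecg(ecg)

middle = ecg_band[360:-360]   # ignore one second at each edge
print(abs(middle.mean()))     # close to 0: the constant offset is removed
print(middle.max())           # close to 1: the 10 Hz component is kept
```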

edited Jun 13 '16 at 02:57

answered Feb 12 '16 at 13:40

Amit Juneja


Can you please give your `butterworth` command for removing artifacts in the range [0.5-45 Hz]? I want to study it because the implementation varies between applications. I am not convinced that the filtering is necessary here. – Léo Léopold Hertz 준영 Apr 25 '16 at 21:10
If you were doing HRV analysis of the signals, should you use the normalization `val=val/2047;`? I am using `sig=(sig-1024)/200;` at the moment, and really worrying that this is wrong. Your `val=val/2047` makes sense because max volt is 2^11 -1. I would like to understand why Rashid claims the other thing. – Léo Léopold Hertz 준영 Apr 25 '16 at 21:15
I opened a new ticket about the thing in Silva's tree because I really need an authoritative answer about the confusion. https://github.com/ikarosilva/wfdb-app-toolbox/issues/119 – Léo Léopold Hertz 준영 Apr 25 '16 at 21:24
Hi, sorry, I just saw this today. The problem is that if you are detecting R-peaks in your HRV analysis you don't want to divide by 200, because then each beat's R-peak position will be shifted and not synced. I used the scipy.signal library in Python for the Butterworth filter. I am pasting my function in the next comment for your reference. – Amit Juneja Jun 13 '16 at 02:50

score 0 · Answer 5 · answered Jan 25 '17 at 00:35

0

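Just use this:

A=input('Enter Variable: ','s');   % name of the .mat file to load
load(A);                           % loads the matrix "val"
a=(val(1,:));                      % first signal (row 1)
b=fir1(100,[0.1,0.25],'stop');     % order-100 FIR band-stop; edges are fractions of the Nyquist frequency
y2=filter(b,1,a);                  % apply the filter
figure;
plot(y2);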

answered Jan 25 '17 at 00:35

Harun


When giving an answer it is preferable to give some explanation as to *WHY* your answer is the one. – Stephen Rauch Jan 25 '17 at 00:57

score -1 · Answer 6 · edited Jun 20 '20 at 09:12

Use ATM to extract the .mat file as described by Kamtal (now known as Rashid). However, note that in some cases you need to click the arrow to see the .info file.

enter image description here

After I pushed this forward to developers here, we got improvements in the documentation here in Section 4.

If they are all integers in the range [-2^N, 2^N-1 ] or [ 0, 2^N ], they are probably digital. Compare the values to see if they are in the expected physiological range of the signal you are analyzing. For example, if the header states that the signal is an ECG stored in millivolts, which typically has an amplitude of about 2mV, a signal of integers ranging from -32000 to 32000 probably isn't giving you the physical ECG in millivolts...

If they are not integers then they are physical. Once again you can quickly compare the values to see if they are in the expected physiological range of the signal you are analyzing.
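The rule of thumb quoted above can be sketched as a small check in Python. This is an illustrative helper, not part of the WFDB Toolbox: the name `looks_digital` and the sample values are assumptions made for the example.

```python
def looks_digital(values):
    """Return True if every sample is a whole number, i.e. probably raw units."""
    return all(float(v).is_integer() for v in values)


raw_samples = [995, 1011, 1024, 1203]             # whole numbers: probably digital
physical_samples = [-0.145, -0.065, 0.0, 0.895]   # fractional: physical (e.g. mV)

print(looks_digital(raw_samples))       # True
print(looks_digital(physical_samples))  # False
```

A physical signal whose samples all happen to be whole numbers would fool this check, which is why the quoted text also recommends comparing the values against the expected physiological range.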

0-9-10 wfdb - physical units

We say that signals are in 'physical units' when the values are used to represent the actual real life values as closely as possible, although obviously everything on the computer is digital and discrete rather than analogue and continuous. This includes our precious 64 bit double precision floating point values, but this is as close as we can get and already very close to the actual physical values, so we refer to them as 'physical'.


For example, if a 15 bit signal is collected via a capturing device, Physionet will likely store it as a 16 bit signal. Each 16 bit block stores an integer value between -2^15 and 2^15-1, and using the gain and offset stated in the header for each channel, the original physical signal can be mapped out for processing.

The default units are now physical units: the base and gain stated in the header for each channel are already applied, so the returned signal can be processed directly.

% rawUnits
%       A 1x1 integer (default: 0). Returns tm and signal as vectors
%       according to the following values:
%               rawUnits=0 - Uses Java Native Interface to directly fetch  data, returning signal in physical units with double precision.
%               rawUnits=1 -returns tm ( millisecond precision only! ) and signal in physical units with 64 bit (double) floating point precision
%               rawUnits=2 -returns tm ( millisecond precision only! ) and signal in physical units with 32 bit (single) floating point  precision
%               rawUnits=3 -returns both tm and signal as 16 bit integers (short). Use Fs to convert tm to seconds.
%               rawUnits=4 -returns both tm and signal as 64 bit integers (long). Use Fs to convert tm to seconds.

rawUnits=1 and rawUnits=2 also return physical units. rawUnits=3 and rawUnits=4 return analog/digital units, where you need to remove base and gain yourself. So if you use rawUnits=3 or rawUnits=4, or a .mat file exported from ATM, you need to adjust for base and gain, where base = 1024 and gain = 200 for the record shown here:

% Kamtal's method, taking base and gain into account
load('201m.mat');
val = (val - 1024)/200;    % subtract "base" and divide by "gain"
ECGsignal = val(1,16:950); % select the first signal (row 1, MLII)

See the .info file below where you can get the base and gain. There is also the unit mV which suggests the values should be near 2 after the base-gain operations.

<0-9-9 wfdb - analog/digital units so base and gain by default; now only `rawUnits=3,4` for analog units

After selecting ATM, you should be able to see the list where you can select the .info file after the export, as described in Kamtal's answer. The .info file instructs you to remove the so-called base and gain from the data before use:

Source: record mitdb/201  Start: [00:02:10.000]
val has 2 rows (signals) and 3600 columns (samples/signal)
Duration:     0:10
Sampling frequency: 360 Hz  Sampling interval: 0.002777777778 sec
Row     Signal  Gain    Base    Units
1       MLII    200     1024    mV
2       V1      200     1024    mV

To convert from raw units to the physical units shown
above, subtract 'base' and divide by 'gain'.
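Copying gain and base by hand is easy to get wrong when a record has several signals, so here is a minimal Python sketch that parses the table shown above. The function name `parse_info` is made up for this example, and the parser assumes the exact column layout of this listing (Row, Signal, Gain, Base, Units); other ATM exports may differ.

```python
INFO_TEXT = """\
Source: record mitdb/201  Start: [00:02:10.000]
val has 2 rows (signals) and 3600 columns (samples/signal)
Duration:     0:10
Sampling frequency: 360 Hz  Sampling interval: 0.002777777778 sec
Row     Signal  Gain    Base    Units
1       MLII    200     1024    mV
2       V1      200     1024    mV
"""


def parse_info(text):
    """Return {signal name: (gain, base, units)} from an ATM .info listing."""
    signals = {}
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if fields[:2] == ["Row", "Signal"]:
            in_table = True          # header row: data rows follow
        elif in_table and len(fields) == 5 and fields[0].isdigit():
            _, name, gain, base, units = fields
            signals[name] = (float(gain), float(base), units)
        elif in_table:
            break                    # first non-matching line ends the table
    return signals


info = parse_info(INFO_TEXT)
gain, base, units = info["MLII"]
print((1224 - base) / gain, units)   # 1.0 mV
```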

Comparing wrong answers here! [Deprecated]

Kamtal's (now called Rashid) answer is about the old wfdb system, which returned digital units without removing base and gain.

% Kamtal's method, taking base and gain into account
load('201m.mat');
val = (val - 1024)/200;    % subtract "base" and divide by "gain"
ECGsignal = val(1,16:950); % select the first signal (row 1, MLII)

% Method without considering base and gain
load('201m.mat');
ECGsignal2 = val(1,16:950);

% imoverlay: http://www.mathworks.com/matlabcentral/fileexchange/10502-image-overlay
imshow(imoverlay(ECGsignal, ECGsignal2, uint8([255,0,0])))

and you get the difference between my method and his method

enter image description here
