I am analyzing accident reports about roof falls in underground mines using Python and pandas. I want to extract the dimensions of each roof fall from a column called Narrative into a new column called roof fall dimension. Specifically, whenever a number in Narrative is followed by a unit of measurement or its symbol (ft, feet, inch, meter, or '), I want to extract that number, the unit, and the word that follows the unit.
Narrative
Fall was 19' wide X 20' long x 7' thick
Fall was approx. 19' W. x 80' L. x 10' H
fall was approx. 5 ft thick, 10 ft wide x 4 ft long
fall is 35 feet long X 5 feet wide X 16 feet in height
roof fall dimension
19' wide, 20' long, 7' thick
19' W, 80' L, 10' H
5 ft thick, 10 ft wide, 4 ft long
35 feet long, 5 feet wide, 16 feet height
I am having trouble figuring out how to check whether the number I want to extract is followed by one of these units (ft, feet, inch, meter, or '), and my experience with regular expressions is limited.
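To make the rule concrete, here is a rough sketch of the kind of pattern I have in mind, using only Python's built-in re module. The names DIMENSION and extract_dimensions are hypothetical, the unit list is just the one given above plus plural forms, and skipping a filler "in" (as in "16 feet in height") is my assumption about how the last sample row should be handled. It may well miss formats that appear elsewhere in the reports.

```python
import re

# Assumed unit list, taken from the question: ft, feet, inch, meter, or '.
# Alphabetic units must end at a word boundary; the ' symbol needs none.
DIMENSION = re.compile(
    r"(\d+(?:\.\d+)?)"                           # the number, e.g. 19 or 4.5
    r"\s*((?:feet|ft|inch(?:es)?|meters?)\b|')"  # the unit or its symbol
    r"\.?\s*(?:in\s+)?"                          # skip "." and a filler "in"
    r"([A-Za-z]+)",                              # the word after the unit
    flags=re.IGNORECASE,
)


def extract_dimensions(text):
    """Return 'number unit word' groups found in text, joined by commas."""
    if not isinstance(text, str):  # e.g. a missing (NaN) narrative
        return ""
    parts = []
    for number, unit, word in DIMENSION.findall(text):
        # Write 19' without a space, but 5 ft with one.
        sep = "" if unit == "'" else " "
        parts.append(f"{number}{sep}{unit} {word}")
    return ", ".join(parts)


print(extract_dimensions("Fall was approx. 19' W. x 80' L. x 10' H"))
```

If this is on the right track, I assume it could be applied to the DataFrame with something like `df["roof fall dimension"] = df["Narrative"].apply(extract_dimensions)`, where `df` is the DataFrame that holds the Narrative column.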
Thanks a lot for helping out.