I have a dataframe that contains a single series (column) of strings that are white space separated and concatenated together.
The dataframe column looks like:
[In]: df
[Out]:
0 -- Generated
1 -- Formatting
2 -- Project1
3 -- Help
4 GRI -- Gen
5 S-P -- Gen
6 COORD-SYS -- Gen
7 COORD -- Gen
8 318108.6945 6146696.895 1829.01714 318108.6945 6146696.895
9 1917.90444 334108.6945 6146696.895 1610.670059 334108.6945
10 6146696.895 1713.102355 350108.6945 6146696.895 1417.752346 350108.6945
11 NEF -- Gen
12 318108.6945 6146696.895 1829.01714 318108.6945 6146696.895
2025.480832 326108.6945 6146696.895 1716.342492 326108.6945
13 1917.90444 334108.6945 6146696.895 1610.670059 334108.6945
6146696.895 1813.845155 342108.6945 342108.6945
.
.
I'd like to separate this one column dataframe by the strings that aren't numbers and take all the numerical values underneath them until the next non-numeric value appears.
The output I need (for example for COORD and similar for NEF) would look like this:
[In]: df_COORD
[Out]:
0 COORD 1 2 3 4 5
1 318108.6945 6146696.895 1829.01714 318108.6945 6146696.895 nan
2 1917.90444 334108.6945 6146696.895 1610.670059 334108.6945 nan
3 6146696.895 1713.102355 350108.6945 6146696.895 1417.752346 350108.6945
I've first attempted at using something like:
df_COORD = pd.DataFrame(df[0][7:11]).reset_index(drop=True)
df_COORD= df_COORD.rename(columns=df_COORD.iloc[0]).drop(0)
df_COORD= df_COORD[file_coord.iloc[0][0]].str.split(expand=True)
Which works but I want to not have to explicitly state which rows I want; I'd like to separate based off the column value containing COORD
or NEF
or any other list
of unique strings and grab data underneath until it reaches another lettered only string.
Is there any way I can do this?