Python, Parsing Column Data, pandas

Question

Please see the image below

I am trying to take the information from column 1 and column 7 and combine it. I have done this kind of operation before, but as the image shows, some entries in column 1 span several rows, while column 7 has one value per row. How can I group the matching information together so that the result looks like this:

x = [('1900ISR', 0), ('2800', 8000, 0), ('2900ISR', 0, 0), ('3900ISR', 0), ('800BB', 0), ('ACEAPP', 0), ('AIR120A', 899), ('AIRCMN', 59), ('APP', 7800), ('ASAMID', 5000, 0, 0, 0, 0), ('C4500', 36990), ('C6000', 297000, 70000, 12000, 0, 0, 60000)]

My attempt was as follows:

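import pandas as pd

# Read each column on its own (usecols replaces the deprecated parse_cols argument)
df1 = pd.read_excel("EXCELFILENAME", usecols="B")
df2 = pd.read_excel("EXCELFILENAME", usecols="H")

# Convert each single-column frame to a list of one-element rows
df1 = df1.values.tolist()
df2 = df2.values.tolist()

# Pair the rows up by position
a = list(zip(df1, df2))
print(a)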

However, this code is incorrect: it zips the two columns together row by row, but it does not account for the entries in the first column that span multiple rows.

To be more specific, see below:

SNIPPET OF OUTPUT

[(['1900ISR'], [0.0]), (['2800'], [8000.0]), (['2900ISR'], [0.0]), (['3900ISR'], [0.0]), (['800BB'], [0.0]), (['ACEAPP'], [0.0]), (['AIR120A'], [0.0]), (['AIRCMN'], [0.0]), (['APP'], [899.0])]

Here in the output, the entry AIR120A should be paired with 899. Instead it gets an incorrect 0.0 that belongs to an earlier entry, because the extra rows of earlier entries were not grouped with them.
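The mismatch can be reproduced without the Excel file. This is only a minimal sketch, assuming that reading column 1 on its own drops its blank cells (which would explain why the output above contains no empty names). The two lists are typed in from the desired result, not read from the file, and the variable names are made up for illustration:

```python
# Names as they would come back if the blank cells in column 1 were dropped (assumption).
names = ["1900ISR", "2800", "2900ISR", "3900ISR", "800BB",
         "ACEAPP", "AIR120A", "AIRCMN", "APP"]

# Column 7 has one value per row, so multi-row entries contribute extra values.
values = [0, 8000, 0, 0, 0, 0, 0, 0, 899, 59, 7800]

# zip pairs purely by position and stops at the end of the shorter list.
pairs = list(zip(names, values))

print(pairs[6])  # ('AIR120A', 0)
print(pairs[8])  # ('APP', 899)
```

Every extra row in an earlier entry pushes the later values one position further away from the names they belong to.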

Is there a way to achieve my desired result?

Is it viable for you to save this excel as `.csv` and go from there? — joaoavf, Feb 20 '18 at 18:29

score 1 · Accepted Answer · answered Feb 20 '18 at 18:57

Use

df.reset_index(inplace=True)

That will move your indices into a column, and from there you can concatenate, and then re-index if necessary.
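The grouping step itself is not spelled out here. One possible way to do it, not necessarily what this answer has in mind, is to read both columns in a single call, forward-fill the names, and group on them. The sketch below assumes the blank cells in column 1 arrive as NaN when both columns are read together; the small DataFrame and the column names "name" and "value" are hypothetical stand-ins for the real file:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for reading both columns in one call, for example
# pd.read_excel("EXCELFILENAME", usecols="B,H"); the real sheet may differ.
df = pd.DataFrame({
    "name": ["1900ISR", "2800", np.nan, "2900ISR", np.nan, "3900ISR",
             "800BB", "ACEAPP", "AIR120A", "AIRCMN", "APP"],
    "value": [0, 8000, 0, 0, 0, 0, 0, 0, 899, 59, 7800],
})

# Carry each name down over the blank rows that belong to it.
df["name"] = df["name"].ffill()

# Collect the values per name; sort=False keeps the order of first appearance.
x = [(name, *group["value"].tolist())
     for name, group in df.groupby("name", sort=False)]

print(x)
```

If the names end up in the index rather than in a column, calling reset_index() first moves them into a column so that the same ffill and groupby steps apply.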

John R