0

df1 is the dataframe having around 15,000 rows with 2 columns of following data in a specific format. Each product number has an associated description. Showing here only 3 rows for instance:

"Product Number (col 1 name)"
A1B2C3D4 123

"Description (col 2 name)"
App 2p 34 serv

"Product Number (col 1 name)"
X1Y2Z3D4 312

"Description (col 2 name)"
Sams 2p 34 serv

"Product Number (col 1 name)"
X1Y2Z3P4 312

"Description (col 2 name)"
Nok 2p 34 ser


With this, I have a template dataframe having few specific rows (10 in number) in the following format. Again 2 columns. Here also, each code has associated description. Showing here only 3 rows for instance:

"Partcode (col 1 name)"
A1B2C3D4

"Description (col 2 name)"
FC

"Partcode (col 1 name)"
X1Y2Z3D4

"Description (col 2 name)"
NC

"Partcode (col 1 name)"
X1Y2Z3P4

"Description (col 2 name)"
PC


With this I have few keywords: App, Sams and Nok.


Using pandas, I am trying to apply the following filters in df1:
codes from template (col 1) in col 1 of df1 and using keywords in col 2 of df1 one by one. The ones filtered out, I am appending the description of col 2 in template with col2 of description in df1

For each keyword I am filtering all partcodes. In the resulting excel, I would like to have only 2 columns which should contain partcodes col 1 (that are filtered out) and the appended description in col2 in different excels as per the description of col2 of template dataframe

I have written the following code for single logic and its generating the output in excel. However, I am not able to scale it for multiple partcodes and keywords. Kindly support to answer.

for i in template.index: 
    if(template['Partcode'][i].__contains__('A1B2C3D4')):
        to_be_concatenated = template['Description'][i]

list1 = []
list2 = []

for i in df1.index:
    if(df1['Product Number'][i].__contains__('A1B2C3D4') & df1['Description'][i].__contains__("App")):
        list1.append(to_be_concatenated + "" + df1['Description'][i])
        list2.append(df1['Product Number'][i])

final1 = list(zip(list2,list1))

output1 = pd.DataFrame(final1, columns = ['Product Number','Description']) 

writer = pd.ExcelWriter('C:\\Desktop\\App.xlsx', engine='xlsxwriter')

output1.to_excel(writer, sheet_name='FC', index = False)
writer.save()
  • hi. share a sample of the original dataframe (print(df)), along with your expected output. Have a look at this for guidance : https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – sammywemmy Jan 23 '20 at 08:59
  • @sammywemmy: I explained the dataframes for input and output here: https://docs.google.com/spreadsheets/d/1EgoSmfN38PQM_IJhRz5T1BXFg2UdwqmL6Ib7vHYq8gE/edit?usp=sharing Please let me know if this helps. – Bhuvie Chhabra Jan 24 '20 at 09:37

0 Answers0