I want to turn a list of values (stored in modernization_area) into column headers. For example, modernization_area takes the values A, B, C, and D, and I want the loop to go through each area and generate columns A, B, C, and D. Ideally the variable modernization_area would be used as the column name in the last line, but Python is not accepting it as a variable.

modernization_list = pd.DataFrame(keyword_table['Modernization_Area'].unique().tolist())

modernization_list.columns = ['Modernization_Area']

x = range(len(modernization_list['Modernization_Area'].unique().tolist()))

for i in x:

    modernization_area = modernization_list._get_value(i, 'Modernization_Area')

    keyword_subset = keyword_table[keyword_table.Modernization_Area == modernization_area]

    keywords = keyword_subset['Keyword'].tolist()

    report_table['a'] = report_table.award_description.str.findall('({0})'.format('|'.join(keywords)), flags=re.IGNORECASE)
Prune
  • Can you share what the original dataframe looks like and what the desired dataframe looks like pls – Joe Ferndz Oct 02 '20 at 03:56
  • Please see [How to provide a reproducible copy of your DataFrame using `df.head(30).to_clipboard(sep=',')`](https://stackoverflow.com/questions/52413246), then **[edit] your question**, and paste the clipboard into a code block. Always provide a [mre] **with code, data, errors, current output, and expected output, as text**. If relevant, plot images are okay. – Trenton McKinney Oct 02 '20 at 04:56

1 Answer

It is not easy to help because your question is missing a lot of information. I am assuming hypothetical keyword_table and report_table data. I am not sure I fully understood what you want, but I hope this piece of code helps:

Block of assumptions:

supposed_keyword_table = pd.DataFrame({'Keyword': ['word1', 'word2', 'word3', 'word4', 'word5', 'word6', 'word7'], 'Modernization Area': ['A', 'B', 'C', 'D', 'A', 'B', 'D']})

supposed_report_table =  pd.DataFrame({'Modernization Area': ['A', 'B', 'C', 'D'], 'Some Value': [1, 2, 3, 4]})

supposed_keyword_table

  Keyword Modernization Area
0   word1                  A
1   word2                  B
2   word3                  C
3   word4                  D
4   word5                  A
5   word6                  B
6   word7                  D

supposed_report_table

  Modernization Area  Some Value
0                  A           1
1                  B           2
2                  C           3
3                  D           4

With those assumptions in place, here is what you can do:

keyword_table_by_mod_area = supposed_keyword_table.groupby(['Modernization Area'])['Keyword'].apply(lambda x: '|'.join(x))

supposed_report_table = pd.merge(supposed_report_table, keyword_table_by_mod_area, on='Modernization Area', how='left')

supposed_report_table

  Modernization Area  Some Value      Keyword
0                  A           1  word1|word5
1                  B           2  word2|word6
2                  C           3        word3
3                  D           4  word4|word7
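
If the goal is instead one column per modernization area, as the loop in the question suggests, a minimal sketch could look like the following. The data here is hypothetical (the real keyword_table and report_table were not shared), and the column names Keyword, Modernization_Area, and award_description are taken from the question's code. The key point is that the loop variable is used directly, without quotes, as the column label.

```python
import re
import pandas as pd

# Hypothetical stand-ins for the tables in the question.
keyword_table = pd.DataFrame({
    'Keyword': ['word1', 'word2', 'word3', 'word4', 'word5', 'word6', 'word7'],
    'Modernization_Area': ['A', 'B', 'C', 'D', 'A', 'B', 'D'],
})
report_table = pd.DataFrame({
    'award_description': ['Word1 and word2 here', 'nothing relevant', 'WORD7 then word3'],
})

# Iterating over the grouped column yields (area name, keywords of that area).
for area, keywords in keyword_table.groupby('Modernization_Area')['Keyword']:
    # Escape each keyword so regex metacharacters are matched literally.
    pattern = '({0})'.format('|'.join(re.escape(k) for k in keywords))
    # Unquoted: the value of `area` ('A', 'B', ...) becomes the column name.
    report_table[area] = report_table['award_description'].str.findall(
        pattern, flags=re.IGNORECASE
    )

print(report_table)
```

Each new column holds, per row, the list of keywords from that area found in award_description (an empty list when nothing matches). Writing report_table['area'] with quotes would instead overwrite a single column literally named "area" on every pass.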
O Pardal