Im doing some text analysis and trying to loop through a dataframe consisting of a list of words in one column and some numeric values in other columns. I want to split out all words in the list column on to different rows and also bring with them the values that were on the same row. I want the code to be usable by others I share with so I've made the code so that they have to input needed columns only once earlier in the code.
I've managed to loop through the dataframe, split out the words and attribute the values when I specify the column names but when I try and make the loop dynamic I cant seem to get the syntax right:
TokensTable = pd.DataFrame({'Token': [], 'Value1': [],'Value2': [],'Value3': []})
counter = 0
for index, row in metricsByToken2.iterrows(): #for each row in the dataframe with values and the token lists
for index2, token in enumerate(row[0]): #for each token in the list of tokens in each row
if token.isalpha(): #If the token doesnt contain punctuation then
token = token.lower() #lowercase the token
if token in stop_words: #if the token is a stop word then
del token #delete the token
else:
TokensTable.loc[counter] = [row[0][index2]] + [row[1]] + [row[2]] + [row[3]]
counter = counter + 1 #increase counter to move to the next row in new df
else:
del token
So if there was the list ['A','B','C'] with other columns 200,300,400 then I would want that split on to 3 separate rows for example: 'A',200,300,400 then 'B',200,300,400 and 'C',200,300,400.
This code above is what has worked for me so far but I have manually specified [Row[1] + [Row[2] etc. [row[0][index2]] will be there for every time I run the code so that has to stay but the number of other columns added on the same row will change each time. The number of columns needed will always be as far as len(TokensTable)-1 though so I need to loop somehow from 0 to len(TokensTable)-1 Im guessing but so far I've had no luck figuring this out so any help would be really appreciated
Example Input:
╔══════════════════╦════════╦════════╦════════╗
║ Text ║ Value1 ║ Value2 ║ Value3 ║
╠══════════════════╬════════╬════════╬════════╣
║ ['A','B','C'] ║ 1 ║ 3 ║ 7 ║
║ ['A1','B1','C1'] ║ 2 ║ 4 ║ 8 ║
║ ['A2','B2','C2'] ║ 3 ║ 5 ║ 9 ║
╚══════════════════╩════════╩════════╩════════╝
Example Output:
╔═══════╦════════╦════════╦════════╗
║ Token ║ Value1 ║ Value2 ║ Value3 ║
╠═══════╬════════╬════════╬════════╣
║ A ║ 1 ║ 3 ║ 7 ║
║ B ║ 1 ║ 3 ║ 7 ║
║ C ║ 1 ║ 3 ║ 7 ║
║ A1 ║ 2 ║ 4 ║ 8 ║
║ B1 ║ 2 ║ 4 ║ 8 ║
║ C1 ║ 2 ║ 4 ║ 8 ║
║ A2 ║ 3 ║ 5 ║ 9 ║
║ B2 ║ 3 ║ 5 ║ 9 ║
║ C2 ║ 3 ║ 5 ║ 9 ║
╚═══════╩════════╩════════╩════════╝