I imported Cooperative.xlsx into a dataframe called df. The output is:
4 Rivers Electric Cooperative Inc
Member Type: Distribution
Lebo, KS | Consumers: 12519
4-County Electric Power Association
Member Type: Distribution
...
Muleshoe, TX | Consumers: 8954
Baldwin County EMC
Member Type: Distribution
Summerdale, AL | Consumers: 82365
There are 2497 rows with cooperative names and associated information.
I want the data to look like this: Cooperative (column 1): 4 Rivers Electric Cooperative Inc.; Member Type (column 2): Distribution; Location (column 3): Lebo, KS; and Consumers (column 4): 12519. I want to iterate through all 2497 rows.
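To make the target concrete, here is a minimal sketch of how one three-line record should map to the four columns. The helper name `record_to_row` is hypothetical, and it assumes every record has exactly a name line, a `Member Type: ...` line, and a `Location | Consumers: ...` line, with the consumer count written as plain digits.

```python
def record_to_row(name_line, member_line, location_line):
    """Map one three-line record to the four desired columns (illustrative helper)."""
    # "Member Type: Distribution" -> "Distribution"
    _, _, member_type = member_line.partition(': ')
    # "Lebo, KS | Consumers: 12519" -> ("Lebo, KS", "12519")
    location, _, consumers = location_line.partition(' | Consumers: ')
    return {
        'Cooperative': name_line.strip(),
        'Member Type': member_type.strip(),
        'Location': location.strip(),
        # Assumes plain digits; a value with thousands separators would need cleaning first
        'Consumers': int(consumers),
    }


row = record_to_row(
    '4 Rivers Electric Cooperative Inc',
    'Member Type: Distribution',
    'Lebo, KS | Consumers: 12519',
)
print(row)
```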
My code looks like this:
import pandas as pd

# Read the Excel file into a DataFrame.
# header=None is an assumption that the sheet has no header row; without it,
# pandas uses the first cooperative name as the column label.
df = pd.read_excel('cooperative.xlsx', header=None)

# Collect one dict per cooperative (DataFrame.append was removed in pandas 2.0)
records = []
# Initialise the variables so the completeness check below never hits a NameError
cooperative = member_type = location = consumers = None

# Iterate through the rows of the original DataFrame
for _, row in df.iterrows():
    value = row.iloc[0]
    # Skip empty cells
    if pd.isna(value):
        continue
    text = str(value).strip()
    if text.startswith('Member Type'):
        # "Member Type: Distribution" -> "Distribution"
        member_type = text.partition(': ')[2]
    elif ' | Consumers: ' in text:
        # "Lebo, KS | Consumers: 12519" -> location and consumers
        # (location rows do not start with "Location", so test for the separator)
        location, consumers = text.split(' | Consumers: ', 1)
    else:
        # Any other row is a cooperative name and starts a new record
        cooperative = text
        member_type = location = consumers = None
    # Check if all the variables have been set
    if cooperative and member_type and location and consumers:
        # If they have, store the record and reset the variables
        records.append({'Cooperative': cooperative,
                        'Member Type': member_type,
                        'Location': location,
                        'Consumers': consumers})
        cooperative = member_type = location = consumers = None

# Create the new DataFrame with the desired columns
new_df = pd.DataFrame(records, columns=['Cooperative', 'Member Type', 'Location', 'Consumers'])
# Print the resulting DataFrame
print(new_df)
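If the rows really do come in strict groups of three, one alternative I could try is to skip the row-by-row state tracking and slice the column into three parallel lists. This is only a sketch under that assumption: `rows_in_threes` is a made-up name, the second sample record is a placeholder rather than real data, and the `sample` list stands in for the first column of `df` (for example `df.iloc[:, 0].astype(str).tolist()`).

```python
def rows_in_threes(lines):
    """Group a flat list of cell values into (name, member type, location) records."""
    if len(lines) % 3 != 0:
        raise ValueError('expected a multiple of three rows, got %d' % len(lines))
    records = []
    # lines[0::3] are names, lines[1::3] member types, lines[2::3] locations
    for name, member, place in zip(lines[0::3], lines[1::3], lines[2::3]):
        if not member.startswith('Member Type: ') or ' | Consumers: ' not in place:
            raise ValueError('unexpected layout near %r' % name)
        location, consumers = place.split(' | Consumers: ', 1)
        records.append({
            'Cooperative': name,
            'Member Type': member[len('Member Type: '):],
            'Location': location,
            'Consumers': consumers,
        })
    return records


sample = [
    '4 Rivers Electric Cooperative Inc',
    'Member Type: Distribution',
    'Lebo, KS | Consumers: 12519',
    # Placeholder record, not taken from the spreadsheet
    'Example Cooperative B',
    'Member Type: Distribution',
    'Sometown, XX | Consumers: 100',
]
for record in rows_in_threes(sample):
    print(record)
```

The returned list can be passed straight to `pd.DataFrame(...)`. Note that 2497 is not a multiple of three, so the sheet probably contains a header or stray row that would have to be dropped before this approach works.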