I'm trying to loop through multiple excel files in a folder, write to a new data frame, then filter by column name. It should only show the property address in the first column, then 3 other columns with corresponding dollar values below the headers. The issue is that each file may not have the same columns or they're duplicate. Here's my code:
read_files = glob.glob('C:/Users/watso/Desktop/yardi_testing_desktop/yardi_downloads/*.xlsx')
files_xlsx = [f for f in read_files if f[-4:] == "xlsx"]
all_data = pd.DataFrame()
for f in files_xlsx:
df = pd.read_excel(f)
df = df.transpose()
new_header = df.iloc[0] # grab the first row for the header
df = df[1:] # take the data less the header row
df.columns = new_header # set the header row as the df header
df.columns = df.columns.astype(str).str.strip()
df = df.iloc[:2, 3:]
all_data = all_data.append(df, sort=True)
all_data.to_csv('C:/Users/watso/PycharmProjects/yardi_noi/yyy.csv', index=False, header=0)
Here's the data frame it creates:
----------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| | | ASSETS | | RENTAL REVENUES | | RENTAL INCOME | Rental Inc: Base Rent | Rental Inc: Telecom Transmit | Total Rental Revenue | | TENANT FUNDED IMPROVEMENTS | Tenant Funded Improvements | Total Tenant Funded Improvements | | STRAIGHT LINE RENT | Straight Line Rent |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| 64 Fake Street - Stony Village(6546-01) | Actual | | | | | | 208598.27 | 2519.7 | 211117.97 | | | 4963.73 | 4963.73 | | | -18726.86 |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| | | ASSETS | | RENTAL REVENUES | | PROPERTY OPERATING EXPENSES | | REIMBURSABLE EXPENSES | | OTHER (INCOME) / EXPENSE | | OTHER (INCOME) | | | | |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| 234 Fake Street - Santa Wilshire(6668-01) | Actual | | | | | | | | | | | | | | | |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| | | ASSETS | | RENTAL REVENUES | | RENTAL INCOME | Rental Inc: Base Rent | Rental Inc: Incidental | Total Rental Revenue | | Total Rental Revenues | | OTHER INCOME | Oth Inc: Miscellaneous | Total Other Income | |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| 255 Fake Street - Minneapolis(9879-02) | Actual | | | | | | -12490.27 | -1060 | -13550.27 | | -13550.27 | | | 100 | 100 | |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| | | ASSETS | | RENTAL REVENUES | | RENTAL INCOME | Rental Inc: Base Rent | Total Rental Revenue | | TENANT FUNDED IMPROVEMENTS | Tenant Funded Improvements | Total Tenant Funded Improvements | | STRAIGHT LINE RENT | Straight Line Rent | Total Straight-Line Rent |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| 543 Fake Street - Seattle(6684-01) | Actual | | | | | | 339802.38 | 339802.38 | | | 39841.64 | 39841.64 | | | 1645.96 | 1645.96 |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
The problem is when I try to select columns by loc() but get an error stating that the columns I choose aren't in the index. Here is how I want the table to look:
+-------------------------------------------+------------------------+----------------------+----------------------------+
| | Rental Inc: Base Rent | Total Rental Revenue | Tenant Funded Improvements |
+-------------------------------------------+------------------------+----------------------+----------------------------+
| 64 Fake Street - Stony Village(6546-01) | 208598.27 | 211117.97 | 4963.73 |
+-------------------------------------------+------------------------+----------------------+----------------------------+
| 234 Fake Street - Santa Wilshire(6668-01) | Nan | Nan | Nan |
+-------------------------------------------+------------------------+----------------------+----------------------------+
| 255 Fake Street - Minneapolis(9879-02) | -12490.27 | -13550.27 | Nan |
+-------------------------------------------+------------------------+----------------------+----------------------------+
| 543 Fake Street - Seattle(6684-01) | 339802.38 | Nan | 39841.64 |
+-------------------------------------------+------------------------+----------------------+----------------------------+
I've been trying to figure this out for almost two weeks before I'm asking for help here. Let me know if you need more info. I'd greatly appreciate any help.