1

My goal is to merge 4 excel worksheets into 1 based on similar Hostnames, Serial Number, Category... I'm using the pandas merge function below.

InventoryDf = pd.read_excel("Inventory.xlsx", sheet_name='Inventory')
SoftwareDf = pd.read_excel("Inventory.xlsx", sheet_name='Software')
HardwarewareDf = pd.read_excel("Inventory.xlsx", sheet_name='Hardware')
CoverageDf = pd.read_excel("Inventory.xlsx", sheet_name='Coverage')
data_frames = [InventoryDf, SoftwareDf, HardwarewareDf, CoverageDf]
merge = partial(pd.merge, on=['Priority','Category','Product Family','Host Name','Serial Number'], how='outer')
merge = reduce(merge, data_frames)

The issue is that each worksheet has a column "IP Address" with mostly similar IPs. For some reason, the merge Dataframe containes 4 columns, with 2 duplicate names: "IP Address_x","IP Address_x","IP Address_y","IP Address_y"

I want to merge those 4 columns into 1 but I can't because they have similar names. I don't have the rename them manually because there are ~30 dataframe columns and it's tedious.

Is there a say to merge them so that:

  1. If the ip is the same, merge it
  2. If the IP is different, use the first "IP Address_x" column on the left
  3. If one column is missing, just the first "IP Address_x" if IP is not empty

This is a sample of the worksheet, I have more columns, like: Name, Url, Site Name, City...

InventoryDf

+-----------+---------------+------------+----------+----------+
| Host Name | Serial Number | IP Address | Priority | Category |
+-----------+---------------+------------+----------+----------+
| SwitchA   | 1230          | 1.1.1.1    | 1        | Switch   |
+-----------+---------------+------------+----------+----------+
| SwitchA   | 1231          | 1.1.1.1    | 1        | Switch   |
+-----------+---------------+------------+----------+----------+
| SwitchB   | 1240          | 1.1.1.2    | 2        | Switch   |
+-----------+---------------+------------+----------+----------+

HardwareDf

+-----------+---------------+------------+----------+----------+
| Host Name | Serial Number | IP Address | Priority | Category |
+-----------+---------------+------------+----------+----------+
| SwitchA   | 1230          | 1.1.0.1    | 1        | Switch   |
+-----------+---------------+------------+----------+----------+
| SwitchD   | 1250          | 1.2.2.2    | 1        | Switch   |
+-----------+---------------+------------+----------+----------+
| SwitchE   | 1260          | 1.3.3.3    | 2        | Switch   |
+-----------+---------------+------------+----------+----------+

SoftwareDf

+-----------+---------------+------------+----------+----------+---------+
| Host Name | Serial Number | IP Address | Priority | Category | Version |
+-----------+---------------+------------+----------+----------+---------+
| SwitchA   | 1230          | 1.1.1.1    | 1        | Switch   | X       |
+-----------+---------------+------------+----------+----------+---------+
| SwitchA   | 1231          | 1.1.1.1    | 1        | Switch   | X       |
+-----------+---------------+------------+----------+----------+---------+
| SwitchB   | 1240          | 1.1.1.2    | 2        | Switch   | Y       |
+-----------+---------------+------------+----------+----------+---------+

CoverageDf

+-----------+---------------+------------+----------+----------+-------------+-------+
| Host Name | Serial Number | IP Address | Priority | Category | Coverage    | Price |
+-----------+---------------+------------+----------+----------+-------------+-------+
| SwitchA   | 1230          | 1.1.1.1    | 1        | Switch   | Not Covered | 100   |
+-----------+---------------+------------+----------+----------+-------------+-------+
| SwitchA   | 1231          | 1.1.1.1    | 1        | Switch   | Covered     | 300   |
+-----------+---------------+------------+----------+----------+-------------+-------+
| SwitchB   | 1240          | 1.1.1.2    | 2        | Switch   | Not Covered | 200   |
+-----------+---------------+------------+----------+----------+-------------+-------+

Expected result (IP Address are merged even though some are different for SwitchA)

+-----------+---------------+------------+----------+----------+---------+-------------+-------+
| Host Name | Serial Number | IP Address | Priority | Category | Version | Coverage    | Price |
+-----------+---------------+------------+----------+----------+---------+-------------+-------+
| SwitchA   | 1230          | 1.1.1.1    | 1        | Switch   | X       | Not Covered | 100   |
+-----------+---------------+------------+----------+----------+---------+-------------+-------+
| SwitchA   | 1231          | 1.1.1.1    | 1        | Switch   | X       | Covered     | 300   |
+-----------+---------------+------------+----------+----------+---------+-------------+-------+
| SwitchB   | 1240          | 1.1.1.2    | 2        | Switch   | Y       | Not Covered | 200   |
+-----------+---------------+------------+----------+----------+---------+-------------+-------+
| SwitchD   | 1250          | 1.2.2.2    | 1        | Switch   |         |             |       |
+-----------+---------------+------------+----------+----------+---------+-------------+-------+
| SwitchE   | 1260          | 1.3.3.3    | 2        | Switch   |         |             |       |
+-----------+---------------+------------+----------+----------+---------+-------------+-------+

RAW extract of the result. Notice losts of redundant columns, IP Address_x

 Source.Name_x  Priority Item Type_x  Category                            Product Family       Product ID_x Software Type_x OS Version_x Suggested Version 1_x                       Host Name  IP Address_x Serial Number                                                  Source.Name_y       Product ID_y Software Type_y OS Version_y           Current Milestone_x Suggested Version 1_y Suggested Version 2 Suggested Version 3  IP Address_y SW End of Life SW End of Sale                                                                                                                    URL_x                                                  Source.Name_x IP Address_x Item Type_y      Product ID_x Current Milestone_y   Hardware Lifecycle Status    Replacement PID Replacement PID Info  Replacement PID Price  Replacement PID Price Discount Replacement PID Service Level  Replacement PID Service Price  Current PID Service Price  Replacement PID Service Price Discount HW End of Life HW End of Sale                                                                                                                    URL_y                                                  Source.Name_y Item Type       Product ID_y  IP Address_y      Coverage Status    Contract Status    Contract Number Coverage Start Date Coverage End Date SLA type
 inventory_30_Jun_15_19_35.xlsx         3     Chassis  Switches     Cisco Catalyst 2960-X Series Switches   WS-C2960X-24PS-L             IOS    15.2(4)E6             15.2(7)E2                  SWITCH-IDF5-A1    10.1.1.8   XXXXX  software_02_Jul_07_54_15.xlsx   WS-C2960X-24PS-L             IOS    15.2(4)E6  End of Vulnerability Support             15.2(7)E2                <NA>                <NA>    10.1.1.8     2023-04-30     2018-05-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-4500-series-switches/eos-eol-notice-c51-739919.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_06_49.xlsx   Chassis   WS-C2960X-24PS-L    10.1.1.8    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 inventory_30_Jun_15_19_35.xlsx         3     Chassis  Switches     Cisco Catalyst 2960-X Series Switches  WS-C2960X-48LPS-L             IOS    15.2(4)E6             15.2(7)E2                  SWITCH-IDF6-A1    10.1.1.9   YYYYY  software_02_Jul_07_54_15.xlsx  WS-C2960X-48LPS-L             IOS    15.2(4)E6  End of Vulnerability Support             15.2(7)E2                <NA>                <NA>    10.1.1.9     2023-04-30     2018-05-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-4500-series-switches/eos-eol-notice-c51-739919.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_06_49.xlsx   Chassis  WS-C2960X-48LPS-L    10.1.1.9    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 inventory_30_Jun_15_19_35.xlsx         3     Chassis  Switches     Cisco Catalyst 2960-X Series Switches   WS-C2960X-24PS-L             IOS    15.2(4)E6             15.2(7)E2                  SWITCH-IDF7-A1   10.1.1.11   ZZZZZZ  software_02_Jul_07_54_15.xlsx   WS-C2960X-24PS-L             IOS    15.2(4)E6  End of Vulnerability Support             15.2(7)E2                <NA>                <NA>   10.1.1.11     2023-04-30     2018-05-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-4500-series-switches/eos-eol-notice-c51-739919.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_06_49.xlsx   Chassis   WS-C2960X-24PS-L   10.1.1.11    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 inventory_30_Jun_15_19_35.xlsx         3     Chassis  Switches     Cisco Catalyst 2960-X Series Switches   WS-C2960X-24PS-L             IOS    15.2(4)E6             15.2(7)E2                  SWITCH-IDF8-A1   10.1.1.12   QQQQQ  software_02_Jul_07_54_15.xlsx   WS-C2960X-24PS-L             IOS    15.2(4)E6  End of Vulnerability Support             15.2(7)E2                <NA>                <NA>   10.1.1.12     2023-04-30     2018-05-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-4500-series-switches/eos-eol-notice-c51-739919.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_06_49.xlsx   Chassis   WS-C2960X-24PS-L   10.1.1.12    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 inventory_30_Jun_15_19_35.xlsx         3     Chassis  Switches     Cisco Catalyst 2960-X Series Switches  WS-C2960X-48LPS-L             IOS    15.2(4)E6             15.2(7)E2                  SWITCH-IDF9-A1   10.1.1.13   WWWWW  software_02_Jul_07_54_15.xlsx  WS-C2960X-48LPS-L             IOS    15.2(4)E6  End of Vulnerability Support             15.2(7)E2                <NA>                <NA>   10.1.1.13     2023-04-30     2018-05-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-4500-series-switches/eos-eol-notice-c51-739919.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_06_49.xlsx   Chassis  WS-C2960X-48LPS-L   10.1.1.13    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 inventory_30_Jun_15_30_08.xlsx         3     Chassis  Switches     Cisco Catalyst 2960-C Series Switches   WS-C2960C-12PC-L             IOS    15.2(4)E6             15.2(7)E2                   SWITCH-MGK-A1   10.1.1.39   EEEEEE  software_02_Jul_08_14_40.xlsx   WS-C2960C-12PC-L             IOS    15.2(4)E6  End of Vulnerability Support             15.2(7)E2                <NA>                <NA>   10.1.1.39     2023-04-30     2018-05-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-4500-series-switches/eos-eol-notice-c51-739919.html  hardware_02_Jul_07_25_04.xlsx  10.1.1.39     Chassis  WS-C2960C-12PC-L  EoL Date Announced  EOL in more than 24 months  WS-C2960L-16PS-LL                 None                  920.7                              50                          PSUT                         215.16                     122.76                                       0     2025-10-31     2020-10-30           https://www.cisco.com/c/en/us/products/switches/catalyst-2960-c-series-switches/eos-eol-notice-c51-743071.html  coverage_24_Jul_10_37_26.xlsx   Chassis   WS-C2960C-12PC-L   10.1.1.39  Uncovered with ELLW        No Contract        No Contract                 NaT               NaT     None
 inventory_30_Jun_15_19_35.xlsx         3     Chassis  Switches     Cisco Catalyst 2960-X Series Switches  WS-C2960X-48LPS-L             IOS    15.2(4)E6             15.2(7)E2               SWITCH-SRVROOM-A1    10.1.1.2   RRRRRR  software_02_Jul_07_54_15.xlsx  WS-C2960X-48LPS-L             IOS    15.2(4)E6  End of Vulnerability Support             15.2(7)E2                <NA>                <NA>    10.1.1.2     2023-04-30     2018-05-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-4500-series-switches/eos-eol-notice-c51-739919.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_06_49.xlsx   Chassis  WS-C2960X-48LPS-L    10.1.1.2    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 inventory_30_Jun_15_20_39.xlsx         3     Chassis  Switches       Cisco Catalyst 3850 Series Switches     WS-C3850-24P-S          IOS-XE       16.3.7                16.9.5               SWITCH-SRVROOM-C1    10.2.1.254   TTTTTT  software_02_Jul_07_54_33.xlsx     WS-C3850-24P-S          IOS-XE       16.3.7            End of Engineering                16.9.5                <NA>                <NA>    10.2.1.3     2023-07-31     2018-08-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-3850-series-switches/eos-eol-notice-c51-740255.html  software_02_Jul_07_02_48.xlsx   10.2.1.254        <NA>              <NA>  End of Engineering                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>     2023-07-31     2018-08-01  https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-3850-series-switches/eos-eol-notice-c51-740255.html  coverage_24_Jul_10_07_28.xlsx   Chassis     WS-C3850-24P-S    10.2.1.254    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 software_30_Jun_15_21_13.xlsx         1        <NA>  Security  Cisco ASA 5500-X with FirePOWER Services               <NA>             ASA      9.7(1)4        9.12.3 Interim  SRVROOM-FW2.umbrellacorp.com  10.60.127.19   YYYYYY  software_02_Jul_07_55_54.xlsx         ASA5506-K9             ASA      9.7(1)4            End of Engineering        9.12.3 Interim       9.8.4 Interim                <NA>  10.1.122.9     2022-08-31     2017-08-25          http://www.cisco.com/c/en/us/products/collateral/security/asa-firepower-services/eos-eol-notice-c51-738646.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_07_48.xlsx   Chassis         ASA5506-K9  10.60.127.19    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None
 software_30_Jun_15_21_13.xlsx         1        <NA>  Security  Cisco ASA 5500-X with FirePOWER Services               <NA>             ASA      9.7(1)4        9.12.3 Interim  FW2.umbrellacorp.com  10.60.127.18   GGGGGGG  software_02_Jul_07_55_54.xlsx         ASA5506-K9             ASA      9.7(1)4            End of Engineering        9.12.3 Interim       9.8.4 Interim                <NA>  10.1.122.8     2022-08-31     2017-08-25          http://www.cisco.com/c/en/us/products/collateral/security/asa-firepower-services/eos-eol-notice-c51-738646.html                                                           <NA>         <NA>        <NA>              <NA>                <NA>                        <NA>               <NA>                 None                    NaN                            <NA>                          <NA>                            NaN                        NaN                                    <NA>            NaT            NaT                                                                                                                     <NA>  coverage_24_Jul_10_07_48.xlsx   Chassis         ASA5506-K9  10.60.127.18    Covered - Non-IBM  Covered - Non-IBM  Covered - Non-IBM                 NaT               NaT     None

Conrad C
  • 746
  • 1
  • 11
  • 32
  • the formatting of your question means I can't see if your column name is `IP_Address` or `1.1.1.1`. the approach I would use if the former is rename the columns before merge to include the sheetname. i.e. `IP_Address_Coverage` – Rob Raymond Aug 14 '20 at 15:46
  • I tried using an ASCII table generator, but it always gives me this output. Do you know how to format the tables properly @RobRaymond? If I change the column names, I still get 4 different columns for IPs at the merge. I'd like to just have one. – Conrad C Aug 14 '20 at 16:27
  • `print(df.to_string(index=False))` is what I use then paste into a post between triple back quotes – Rob Raymond Aug 14 '20 at 16:29
  • @RobRaymond Thanks – Conrad C Aug 14 '20 at 16:42

1 Answers1

1

Started with advanced techniques using functools. Add inspect to the mix get variable name

  1. iterate over your list of dataframes. Capture the name and rename the IP address column
  2. once have merged dataframe rename left most IP address column
  3. fillna() from other IP address columns and drop them
import inspect
import functools

def retrieve_name(var):
    callers_local_vars = inspect.currentframe().f_back.f_locals.items()
    return [var_name for var_name, var_val in callers_local_vars if var_val is var]

data_frames = [InventoryDf, SoftwareDf, HardwareDf, CoverageDf]
names = []
for df in data_frames:
    n = retrieve_name(df)[1].replace("Df", "")
    names.append(n)
    df.columns = [f"{n} {c}" if c=="IP Address" else c for c in df.columns]
# merge = functools.partial(pd.merge, on=['Priority','Category','Product Family','Host Name','Serial Number'], how='outer')
merge = functools.partial(pd.merge, on=['Priority','Category','Host Name','Serial Number'], how='outer')

merge = functools.reduce(merge, data_frames)

# take column LHS IP Address and rename it to "IP Address", fillna() from all subsequent columns
# then drop them
merge.rename(columns={f"{names[0]} IP Address":"IP Address"}, inplace=True)
for n in names[1:]:
    merge.loc[:,"IP Address"].fillna(merge.loc[:,f"{n} IP Address"], inplace=True)
    merge.drop(columns=f"{n} IP Address", inplace=True)
    

Rob Raymond
  • 29,118
  • 3
  • 14
  • 30