
I have two DataFrames that I created with pandas in Python, and I want to combine them into one DataFrame with the columns Name, PrivateIP and Memory.

hosts_df:

***Hosts Data Frame:
                      Name     PrivateIP
0               bastion001     10.238.2.166
1          logicmonitor001     10.238.2.52
2          logicmonitor002     45.21.2.1
***

memory_df:

***Memory Data Frame:
     Memory
0   1843260
0   7706164
0   7904828
***

I also want to get rid of the column of zeros in memory_df.

Here is my code.


    import os
    import pandas as pd
    # Directory that holds one memory reading per text file
    text_path = "/home/tdun0002/stash/cloud_scripts/aws_scripts/output_files/memory_stats/text/"
    filelist = os.listdir(text_path)
    # Read the servers into the DF
    hosts_list = os.path.join('..', '..', 'source_files', 'aws_hosts_list', 'aws_hosts_list.csv')
    hosts_df = pd.read_csv(hosts_list, skipinitialspace=True)
    hosts_df.columns = ["Name", "PrivateIP"]
    # Create the memory dataframe
    column_names = ["Memory"]
    memory_df = pd.DataFrame(columns=column_names)
    print("Reading text files into the Memory DF")
    for filename in filelist:
        print(f"Adding filename: {filename}")
        filename = text_path + filename
        temp_df = pd.read_csv(filename, delim_whitespace=True, names=column_names)
        memory_df = memory_df.append(temp_df)

    memory_df.Memory = memory_df.Memory.astype("int32")

How can I combine these DataFrames into one with the columns Name, PrivateIP and Memory?

bluethundr
  • Don't repeatedly append DataFrames to each other. Read them all into a standard Python list and then use `pandas.concat` at the end. – Paul H Jun 01 '20 at 04:16
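The suggestion in the comment above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual script: the helper name `read_memory_files` and the temporary sample files are hypothetical, and it assumes each text file holds a single memory value and that the sorted file order matches the row order of hosts_df.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_memory_files(text_dir):
    """Read every file in text_dir into one single-column DataFrame."""
    frames = []
    for path in sorted(Path(text_dir).iterdir()):
        # sep=r"\s+" splits on any run of whitespace
        frames.append(pd.read_csv(path, sep=r"\s+", names=["Memory"]))
    # Concatenate once at the end; ignore_index=True replaces the
    # repeated 0 labels with a fresh 0..n-1 index
    return pd.concat(frames, ignore_index=True)


# Hypothetical sample files standing in for the real memory_stats output
with tempfile.TemporaryDirectory() as tmp:
    for name, value in [("a.txt", 1843260), ("b.txt", 7706164), ("c.txt", 7904828)]:
        Path(tmp, name).write_text(f"{value}\n")
    memory_df = read_memory_files(tmp)

print(memory_df)
```

Building the list first and concatenating once avoids copying the growing DataFrame on every iteration, and `ignore_index=True` also removes the column of zeros mentioned in the question.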

1 Answer


Try this.

  • Reset the index of memory_df with drop=True, which discards the repeated 0 labels.
  • Concatenate hosts_df and memory_df column-wise (axis=1).

    pd.concat([hosts_df, memory_df.reset_index(drop=True)], axis=1)

Example code:

    import pandas as pd

    # Dummy data; memory_df gets a repeated 0 index, as in the question
    hosts_df = pd.DataFrame({'Name': ['bastion001', 'logicmonitor001', 'logicmonitor002'],
                             'PrivateIP': ["10.238.2.166", "10.238.2.52", "45.21.2.1"]})
    memory_df = pd.DataFrame({'Memory': [1843260, 7706164, 7904828]}, index=[0, 0, 0])

    # Concatenate column-wise after replacing the repeated index with 0..n-1
    df = pd.concat([hosts_df, memory_df.reset_index(drop=True)], axis=1)
    print(df)

Output:

                  Name     PrivateIP   Memory
    0       bastion001  10.238.2.166  1843260
    1  logicmonitor001   10.238.2.52  7706164
    2  logicmonitor002     45.21.2.1  7904828
CypherX
  • @bluethundr Is this what you are looking for? – CypherX Jun 01 '20 at 04:14
  • Yes, that's perfect. Thanks! Question for you tho. On the servers that I can't log into and can't get a memory reading for, I see this in the dataframe: `B-surveys-1c2e 10.238.2.39 NaN`. Is there a way I can substitute `NaN` for a different value? Such as `Null`? – bluethundr Jun 01 '20 at 13:59
  • See these two: (1) [`pandas fillna()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.fillna.html) and (2) [Stack Overflow question on replacing NaN](https://stackoverflow.com/questions/13295735/how-can-i-replace-all-the-nan-values-with-zeros-in-a-column-of-a-pandas-datafram). Combine them to get what you want (if I understand you correctly). – CypherX Jun 02 '20 at 01:13
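As a rough sketch of what the `fillna()` suggestion might look like: the sample rows and the `Memory_numeric` and `Memory_display` column names below are made up for illustration, and the choice of placeholder (0 or the string "Null") is an assumption about what the asker wants.

```python
import numpy as np
import pandas as pd

# Hypothetical combined frame with one host that has no memory reading
df = pd.DataFrame({
    "Name": ["bastion001", "B-surveys-1c2e"],
    "PrivateIP": ["10.238.2.166", "10.238.2.39"],
    "Memory": [1843260, np.nan],
})

# Option A: keep the column numeric by filling missing readings with 0
df["Memory_numeric"] = df["Memory"].fillna(0).astype("int64")

# Option B: a text placeholder for display. Going through the nullable
# Int64 dtype first avoids rendering the values as floats ("1843260.0").
df["Memory_display"] = (
    df["Memory"].astype("Int64").astype("string").fillna("Null")
)

print(df[["Name", "PrivateIP", "Memory_display"]])
```

Note that Option B turns the column into strings, so it is only suitable for reporting; any arithmetic on memory values would need the numeric column.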