I have two YAML files that have a similar structure but different data. I need to parse out the ip and name of each host and put them into a single CSV file (to open in Excel) as three columns.

Sample:

instances:
- host: 173.20.1.1
  timeout: 1.0
  tags:
  - ip:173.20.1.1
  - env:prod
  - type:virtual
  - name:2-base
  - hardware:server
- host: 174.28.2.2
  timeout: 1.0
  tags:
  - ip:174.28.2.2
  - env:prod
  - type:virtual
  - name:2-game
  - hardware:server
- host: 174.28.32.8
  timeout: 1.0
  tags:
  - ip:174.28.32.8
  - env:prod
  - type:virtual
  - name:2-play
  - hardware:server

Expected output:

https://i.postimg.cc/nLKrppsv/output-csv-Excel.png

I reviewed similar questions at these links but I'm stumped:

Need a script that extracts from a yaml file content and output as a csv file

Convert several YAML files to CSV

This is my current code:

import yaml
import csv
import glob


yaml_file_names = glob.glob('./*.yaml')

rows_to_write = []

for idx, each_yaml_file in enumerate(yaml_file_names):
    print("Processing file ", idx+1, "of", len(yaml_file_names), "file name:", each_yaml_file)
    with open(each_yaml_file) as f:
        data = yaml.safe_load(f)
        
        for each_dict in data['instances']:
            for each_nested_dict in each_dict['host']:
                for each_option in each_nested_dict['tags']:
                    #write to csv yaml_file_name, each_nested_dict['tags'], each_option
                    rows_to_write.append([each_yaml_file, each_nested_dict['ip'], each_option])
                    rows_to_write.append([each_yaml_file, each_nested_dict['name'], each_option])



with open('output_csv_file.csv', 'w') as out:
    csv_writer = csv.writer(out, delimiter='|', quotechar=' ')
    csv_writer.writerows(rows_to_write)
    print("Output file output_csv_file.csv created")

I get this error:

Processing file  1 of 2 file name: .\conf.yaml
Traceback (most recent call last):
  File "C:\NGSC\yaml2csv3.py", line 21, in <module>
    for each_option in each_nested_dict['tags']:
builtins.TypeError: string indices must be integers

I know there are a lot of threads on "string indices must be integers", but I haven't been successful with them so far.

PythonDawg

1 Answer

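The tags hang directly off each entry in instances, and each tag is a plain "key:value" string, so you can loop over instance["tags"] and split each one. You can try this approach: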

import yaml
import csv
import glob


yaml_file_names = glob.glob('./*.yaml')

rows_to_write = []

for i, each_yaml_file in enumerate(yaml_file_names):
    print("Processing file {} of {} file name: {}".format(
        i+1, len(yaml_file_names),each_yaml_file))

    with open(each_yaml_file) as file:
        data = yaml.safe_load(file)
        for instance in data["instances"]:
            values = {}
            for tag in instance["tags"]:
                # Each tag is a "key:value" string; split on the first colon only.
                key, _, value = tag.partition(":")
                if key in ("ip", "name"):
                    values[key] = value

            # Raises KeyError if an instance lacks an ip or name tag.
            rows_to_write.append([instance["host"], values["ip"], values["name"]])


with open('output_csv_file.csv', 'w', newline='') as out:
    csv_writer = csv.writer(out)
    csv_writer.writerow(["host","ip","name"])
    csv_writer.writerows(rows_to_write)
    print("Output file output_csv_file.csv created")

Output:

host,ip,name
173.20.1.1,173.20.1.1,2-base
174.28.2.2,174.28.2.2,2-game
174.28.32.8,174.28.32.8,2-play
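
As for the error in the question: each_dict['host'] is a plain string, so looping over it yields single characters, and indexing a character with 'tags' raises the TypeError. Here is a small sketch of that. It uses a hand-written dict that I'm assuming mirrors what yaml.safe_load returns for the first sample entry; the variable names are only illustrative.

```python
# Hand-written stand-in for one parsed entry from the sample YAML (an assumption).
instance = {
    "host": "173.20.1.1",
    "timeout": 1.0,
    "tags": ["ip:173.20.1.1", "env:prod", "type:virtual",
             "name:2-base", "hardware:server"],
}

# Iterating over a string yields its characters one at a time.
first_char = next(iter(instance["host"]))

try:
    first_char["tags"]  # same operation as each_nested_dict['tags'] in the question
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(repr(first_char), "->", error_message)

# The tags live directly on the instance, as a list of "key:value" strings.
print(instance["tags"][0])
```

So the middle loop over each_dict['host'] is what needs to go; the tags can be read straight from each_dict['tags'].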

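
If you later want more of the tags as columns, one possible variation is to turn each tag list into a dict and let csv.DictWriter pick the fields. This is only a sketch: tags_to_dict is a hypothetical helper, the instances list is a hand-written stand-in for the parsed YAML, and it writes to an in-memory buffer rather than a file.

```python
import csv
import io


def tags_to_dict(tags):
    """Turn ["ip:1.2.3.4", "env:prod"] into {"ip": "1.2.3.4", "env": "prod"}."""
    # partition splits on the first colon only, so values may contain colons.
    return {key: value for key, _, value in (tag.partition(":") for tag in tags)}


# Hand-written stand-in for the parsed "instances" list (an assumption).
instances = [
    {"host": "173.20.1.1", "tags": ["ip:173.20.1.1", "env:prod", "name:2-base"]},
    {"host": "174.28.2.2", "tags": ["ip:174.28.2.2", "env:prod", "name:2-game"]},
]

out = io.StringIO()  # swap for open('output_csv_file.csv', 'w', newline='')
# extrasaction="ignore" drops tags that are not listed in fieldnames.
writer = csv.DictWriter(out, fieldnames=["host", "ip", "name"], extrasaction="ignore")
writer.writeheader()
for instance in instances:
    row = {"host": instance["host"], **tags_to_dict(instance["tags"])}
    writer.writerow(row)

csv_text = out.getvalue()
print(csv_text)
```

Adding "env" or "type" to fieldnames would then add those columns without touching the loop.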

xio