
I'm very new to Python (2 days), so bear with me. I'm trying to write YAML from readCsv.py to a getData.yml file. Everything seems to work, but the YAML exported from readCsv.py ends up wrapped in quotes in getData.yml. Why is that?

Here's where the magic happens, readCsv.py:

import csv
import pandas
import yaml

""" Reading whole csv file with panda library """
df = pandas.read_csv('chord-progressions.csv')

""" Read in csv, but easy read with pandas """
""" print(df) """

""" Convert csv to yml """
text = yaml.dump(
    df.to_dict(orient='records'),
    sort_keys=False, width=72, indent=4)

print(text)
""" Export the recently converted yml to getData.yml """
with open('getData.yml', 'w') as outfile:
    yaml.dump(text, outfile, default_flow_style=False)

Output in my getData.yml file:

"-   1st chord: 4\n    2nd chord: 1\n    3rd chord: 5\n    4th chord: Alternative\n\
  \    Progression: .nan\n-   1st chord: 4\n    2nd chord: 1\n    3rd chord: 5\n \
  \   4th chord: Catchy\n    Progression: .nan\n-   1st chord: 1\n    2nd chord: 1\n\
  \    3rd chord: 1\n    4th chord: Didgeridoo\n    Progression: .nan\n-   1st chord:\
  \ 6\n    2nd chord: 4\n    3rd chord: 5\n    4th chord: Dreadful\n    Progression:\
  \ .nan\n-   1st chord: 6\n    2nd chord: 2\n    3rd chord: 5\n    4th chord: Dreadful\n\
  \    Progression: .nan\n-   1st chord: 6\n    2nd chord: 2\n    3rd chord: 4\n \
  \   4th chord: Endless\n    Progression: .nan\n-   1st chord: 3\n    2nd chord:\
  \ 4\n    3rd chord: 6\n    4th chord: Energetic\n    Progression: .nan\n-   1st\
  \ chord: 5\n    2nd chord: 1\n    3rd chord: 4\n    4th chord: Folk\n    Progression:\
  \ .nan\n-   1st chord: 6\n    2nd chord: 1\n    3rd chord: 4\n    4th chord: Folk\n\
  \    Progression: .nan\n-   1st chord: 5\n    2nd chord: 4\n    3rd chord: 3\n \
  \   4th chord: Flamenco\n    Progression: .nan\n-   1st chord: 5\n    2nd chord:\
  \ 6\n    3rd chord: 5\n    4th chord: Flamenco\n    Progression: .nan\n-   1st chord:\
  \ 4\n    2nd chord: 3\n    3rd chord: 6\n    4th chord: Grunge\n    Progression:\
  \ .nan\n-   1st chord: 5\n    2nd chord: 1\n    3rd chord: 6\n    4th chord: Jazz\n\
  \    Progression: .nan\n-   1st chord: 4\n    2nd chord: 5\n    3rd chord: 4\n \
  \   4th chord: Love\n    Progression: .nan\n-   1st chord: 4\n    2nd chord: 1\n\
  \    3rd chord: 5\n    4th chord: Memories\n    Progression: .nan\n-   1st chord:\
  \ 5\n    2nd chord: 6\n    3rd chord: 4\n    4th chord: Pop\n    Progression: .nan\n\
  -   1st chord: 6\n    2nd chord: 3\n    3rd chord: 7\n    4th chord: Pop\n    Progression:\
  \ .nan\n-   1st chord: 1\n    2nd chord: 4\n    3rd chord: 5\n    4th chord: Rebellious\n\
  \    Progression: .nan\n-   1st chord: 4\n    2nd chord: 5\n    3rd chord: 5\n \
  \   4th chord: Sad\n    Progression: .nan\n-   1st chord: 5\n    2nd chord: 4\n\
  \    3rd chord: 4\n    4th chord: Sad\n    Progression: .nan\n-   1st chord: 4\n\
  \    2nd chord: 5\n    3rd chord: 4\n    4th chord: Sad\n    Progression: .nan\n\
  -   1st chord: 4\n    2nd chord: 1\n    3rd chord: 1\n    4th chord: Sweet\n   \
  \ Progression: .nan\n-   1st chord: 4\n    2nd chord: 1\n    3rd chord: 4\n    4th\
  \ chord: Simple\n    Progression: .nan\n-   1st chord: 5\n    2nd chord: 5\n   \
  \ 3rd chord: 1\n    4th chord: Simple\n    Progression: .nan\n-   1st chord: 4\n\
  \    2nd chord: 1\n    3rd chord: 4\n    4th chord: Wildside\n    Progression: .nan\n\
  -   1st chord: 1\n    2nd chord: 4\n    3rd chord: 6\n    4th chord: Wistful\n \
  \   Progression: .nan\n-   1st chord: 1\n    2nd chord: 5\n    3rd chord: 7\n  \
  \  4th chord: Moody\n    Progression: .nan\n-   1st chord: 1\n    2nd chord: 7\n\
  \    3rd chord: 6\n    4th chord: Moody\n    Progression: .nan\n"


Ram
gospecomid12

2 Answers

2

What is happening here is that you are dumping your YAML twice. First, on this line:

""" Convert csv to yml """
text = yaml.dump(
    df.to_dict(orient='records'),
    sort_keys=False, width=72, indent=4)

At this point, text is a string that already contains your data serialized as YAML.

Then, you dump it again, here:

""" Export the recently converted yml to getData.yml """
with open('getData.yml', 'w') as outfile:
    yaml.dump(text, outfile, default_flow_style=False)

Because what you are dumping the second time is just a string, it is written to your file as a single quoted string rather than as a list of records. You can either write the string directly into a file:

with open('getData.yml', 'w') as outfile:
    outfile.write(text)

Or you can dump the converted CSV data directly to the YAML file, like this:

with open('getData.yml', 'w') as outfile:
    yaml.dump(
        df.to_dict(orient='records'), outfile,
        sort_keys=False, width=72, indent=4, default_flow_style=False)
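
To see the double dump in isolation, here is a minimal sketch that skips pandas and uses a small hand-written list in place of df.to_dict(orient='records'). The sample records are an assumption made up for illustration, not the asker's real data. Loading each result back shows that the first dump round-trips to a list, while the second one round-trips to a plain string.

```python
import yaml

# Hypothetical sample records standing in for df.to_dict(orient='records').
records = [
    {"1st chord": 4, "2nd chord": 1, "3rd chord": 5, "4th chord": "Alternative"},
    {"1st chord": 6, "2nd chord": 4, "3rd chord": 5, "4th chord": "Dreadful"},
]

# First dump: Python objects -> YAML text (this is what `text` holds).
once = yaml.dump(records, sort_keys=False, width=72, indent=4)

# Second dump: the YAML text is itself treated as data, i.e. one big string.
twice = yaml.dump(once, default_flow_style=False)

loaded_once = yaml.safe_load(once)    # parses back into the list of dicts
loaded_twice = yaml.safe_load(twice)  # parses back into the YAML text, a str

print(type(loaded_once).__name__)
print(type(loaded_twice).__name__)
```

If the file is meant to be read by another YAML consumer, checking the type returned by yaml.safe_load is a quick way to confirm that the data was serialized only once.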
tituszban
  • I have another question, if you have time to answer. This code worked well, thanks, but what if I only want, say, the first line of the CSV file to be dumped as YAML? How do I accomplish that? – gospecomid12 Aug 25 '21 at 11:16
  • Just grab the first row of the dataframe. https://stackoverflow.com/questions/25254016/pandas-get-first-row-value-of-a-given-column/25254087 – tituszban Aug 25 '21 at 16:09
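
As a rough sketch of the suggestion in the comment above, one way to keep only the first row is df.head(1) before converting to records. The inline CSV text below is a made-up stand-in for chord-progressions.csv, and reading it through io.StringIO is only there to keep the example self-contained.

```python
import io

import pandas
import yaml

# Hypothetical CSV content standing in for chord-progressions.csv.
csv_text = (
    "1st chord,2nd chord,3rd chord,4th chord\n"
    "4,1,5,Alternative\n"
    "6,4,5,Dreadful\n"
)
df = pandas.read_csv(io.StringIO(csv_text))

# head(1) keeps a one-row DataFrame, so to_dict(orient='records')
# still returns a list, now holding a single dict.
first_row = df.head(1).to_dict(orient='records')

text = yaml.dump(first_row, sort_keys=False, width=72, indent=4)
print(text)
```

Using df.iloc[[0]] instead of df.head(1) should select the same row; either way the result is dumped exactly once.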
1

text is already YAML; you don't need to encode it a second time.

with open('getData.yml', 'w') as outfile:
    print(text, file=outfile)

or

with open('getData.yml', 'w') as outfile:
    yaml.dump(
        df.to_dict(orient='records'),
        outfile,
        sort_keys=False,
        width=72,
        indent=4
    )
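
One small difference between print(text, file=outfile) and outfile.write(text) is that print appends its own newline. The sketch below uses in-memory buffers and a short hypothetical YAML string to show this; passing end='' to print is one way to avoid the extra blank line at the end of the file.

```python
import io

# Hypothetical YAML text; yaml.dump output already ends with a newline.
text = "-   1st chord: 4\n    2nd chord: 1\n"

buf_print = io.StringIO()
print(text, file=buf_print)              # print adds one more "\n"

buf_write = io.StringIO()
buf_write.write(text)                    # written exactly as-is

buf_print_no_end = io.StringIO()
print(text, file=buf_print_no_end, end="")  # suppress the extra newline

printed = buf_print.getvalue()
written = buf_write.getvalue()
printed_no_end = buf_print_no_end.getvalue()
```

The extra trailing newline is harmless to a YAML parser, so this is mostly a matter of keeping the output file tidy.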
chepner