I am trying to read strings of processed tweets that I have saved in a CSV file in this format:

rt mayasolovely woman complain cleaning house amp man always take trash,momma said pussy cats inside doghouse,addicted2guys simplyaddictedtoguys http co 1jl4hi8zmf woof woof hot scally lad,allaboutmanfeet http co 3gzupfumev woof woof hot soles

Each tweet is separated by a comma, and I want to read them all into a 1D list of strings. I tried the following code:

import csv

X_new = []
filename = 'output.csv'
with open(filename, 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        X_new.append(row)

print(X_new)

The result I get is the following:

[['rt mayasolovely woman complain cleaning house amp man always take trash', 'momma said pussy cats inside doghouse', 'addicted2guys simplyaddictedtoguys http co 1jl4hi8zmf woof woof hot scally lad', 'allaboutmanfeet http co 3gzupfumev woof woof hot soles']]

I don't understand why it gives me a 2D list (a list inside a list). I wanted it to return this:

['rt mayasolovely woman complain cleaning house amp man always take trash', 'momma said pussy cats inside doghouse', 'addicted2guys simplyaddictedtoguys http co 1jl4hi8zmf woof woof hot scally lad', 'allaboutmanfeet http co 3gzupfumev woof woof hot soles']

What is going wrong here? Thank you in advance!

thelaw
  • That's because you append row and rows are lists. – cglacet Mar 20 '19 at 21:07
  • Because the `csv` module represents rows as lists. Since you have a single row, you get a single list appended to your outer list. If you are sure you only ever have one row and you're happy with that, don't append `row` to anything and use it directly – roganjosh Mar 20 '19 at 21:07
  • Usually a CSV file has multiple lines, and each line represents a collection of related data. You probably should be putting each tweet on its own line, not as a field in one line. – Barmar Mar 20 '19 at 21:11
  • @roganjosh thank u that solved it! – thelaw Mar 20 '19 at 21:14
  • @Barmar you mean that I should change the data I have in the saved file? – thelaw Mar 20 '19 at 21:14
  • Yes, that's what I'm suggesting. It doesn't need to be CSV if you're not saving sets of related data. – Barmar Mar 20 '19 at 21:23
  • My data is coming from a 1d list called X and so I use `wr.writerow(X)` to write them all in the csv file. Is there a way to put each tweet in a different line? – thelaw Mar 20 '19 at 21:26
  • Related but not exactly a dupe I think: https://stackoverflow.com/a/24662707/1317713 – Leonid Mar 20 '19 at 21:30
  • There is _nothing inherently wrong_ with saving data as a single line CSV. It is not a cumbersome file format, it does not require any external dependencies and is transferable virtually anywhere. Unless you have pressures for concurrent writes or anything else, I really think the discussion about changing formats is only going to push you off-course – roganjosh Mar 20 '19 at 22:17
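On the question raised in the comments about putting each tweet on a different line: one possible approach is to call `writerow` once per tweet with a one-element list, instead of a single `wr.writerow(X)`. The sketch below is only an illustration. The contents of `X` are made-up sample data, and an in-memory buffer stands in for `output.csv` so the example runs without touching the disk.

```python
import csv
import io

# Hypothetical sample data standing in for the 1D list X from the comments.
X = ["first tweet text", "second tweet text", "third tweet text"]

# An in-memory buffer stands in for output.csv (an assumption for this sketch).
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
for tweet in X:
    writer.writerow([tweet])  # one single-field row per tweet

# Read the rows back: each row is a one-element list, so take its first field.
buffer.seek(0)
tweets = [row[0] for row in csv.reader(buffer)]
print(tweets)
```

With a real file, the same loop would run inside `with open(filename, 'w', newline='') as f:`. Whether this layout is preferable to a single-line CSV depends on how the file is used later.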

1 Answer

A row is a list, so appending it adds the whole row as a single nested element. Instead, extend your original list with the row's fields:

import csv

X_new = []
filename = 'output.csv'
with open(filename, 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        X_new.extend(row)  # add each field of the row, not the row itself

print(X_new)
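
To make the difference between `append` and `extend` concrete, here is a small self-contained sketch. The CSV text in `data` is made-up sample content, and `io.StringIO` stands in for the real file. It also shows the option mentioned in the comments of using the single row directly, which only makes sense if the file is known to contain exactly one row.

```python
import csv
import io

# Hypothetical one-line CSV content standing in for output.csv.
data = "first tweet,second tweet,third tweet\n"

nested = []
for row in csv.reader(io.StringIO(data)):
    nested.append(row)  # adds the whole row as one element

flat = []
for row in csv.reader(io.StringIO(data)):
    flat.extend(row)  # adds each field of the row separately

# If the file is known to hold exactly one row, take that row directly.
single = next(csv.reader(io.StringIO(data)))

print(nested)  # list containing one list
print(flat)    # flat list of strings
print(single)  # same flat list of strings
```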
Alakazam
  • I have not checked but `X_new = list(reader)` would probably work - it would be shorter and probably faster. Update: https://stackoverflow.com/a/24662707/1317713 – Leonid Mar 20 '19 at 21:28