
I am using a file from Insideairbnb.com for my thesis. It is a csv.gz file so first I extracted it using the 'Archive Utility' for Mac.

It is comma-delimited and uses double quotes as the text qualifier, which I specified in the import dialog, but Excel/SPSS still splits fields at the commas inside the quoted text.

It is a large file that includes full Airbnb descriptions and reviews, which are enclosed in double quotes. Unfortunately, there are many commas within those strings. I have never seen a CSV file in this format, but I believe it was put together correctly because I have seen Insideairbnb cited as a data source in quite a few scholarly articles.

I have included links to screenshots showing a snippet of the data in the SPSS import window. If anyone knows how to go about importing this, I would greatly appreciate your feedback :)

Thank you in advance!

SPSS screenshot: https://i.stack.imgur.com/Iy3dA.png
SPSS screenshot 2: https://i.stack.imgur.com/i7KcG.png

sara white
  • I'm not sure the screenshot has attached properly. Or at least, I don't see it. Can you describe a little more what you've tried so far, and what error you're getting? Does it refuse to import at all, or is the resulting data not separated the way it should be? – bjmc Nov 03 '18 at 00:09
  • Thank you for letting me know! I re-attached the image links. – sara white Nov 03 '18 at 00:48
  • It imports the data but the text from the reviews separates into a new cell after each comma. There are also many empty rows and cells with missing data. I think this might be because on the original CSV it is organised in something like mini paragraphs with page breaks. – sara white Nov 03 '18 at 00:54
  • I am having the same problem with data from Insideairbnb.com, also for my thesis. Could you share your solution if possible please? – Nancy Collins Jun 25 '20 at 14:40
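One way to check whether the file itself is well formed, independent of Excel or SPSS, is to parse it with a CSV reader that honours the double-quote text qualifier and count the fields in each record. The following is only a minimal sketch in Python: the function name `field_count_histogram` is made up for illustration, and the UTF-8 default is an assumption about the file's encoding. It reads the .gz directly, so no separate extraction step is needed.

```python
import csv
import gzip
from collections import Counter


def field_count_histogram(path, encoding="utf-8"):
    """Return a Counter mapping 'number of fields' to 'number of records'.

    Hypothetical helper for illustration; the encoding is an assumption.
    """
    counts = Counter()
    # newline="" leaves line breaks untouched so the csv module can keep
    # the ones that sit inside quoted fields as part of the field text.
    with gzip.open(path, mode="rt", encoding=encoding, newline="") as handle:
        for record in csv.reader(handle, delimiter=",", quotechar='"'):
            counts[len(record)] += 1
    return counts
```

If the result contains a single key, every record has the same number of fields and the quoting in the file is consistent; in that case the stray cells and empty rows are more likely coming from how the importing program treats commas and line breaks inside the quoted text.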

2 Answers


I agree with @sarawhite's comment above; if this is a one-time problem there are a couple things I would try.

  1. Open the .csv in Excel and, if it looks right, save it and then try importing it into SPSS, or save it as an .xlsx file and import that (although string variables can cause problems in either scenario).

OR

  2. Open the file in Notepad++ and look at the raw data. You can find and replace double line breaks fairly easily.
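
If the find-and-replace is awkward to do by hand on a large file, the same clean-up can be scripted. This is a rough sketch in Python, not something taken from the thread: the function name `flatten_line_breaks` and both path arguments are hypothetical, it assumes the extracted file is UTF-8, and it swaps the manual Notepad++ step for the standard `csv` module, so that only line breaks inside fields are replaced and the record boundaries stay intact.

```python
import csv
import re


def flatten_line_breaks(src_path, dst_path, encoding="utf-8"):
    """Copy a CSV file, replacing line breaks inside fields with spaces.

    Hypothetical helper for illustration; the encoding is an assumption.
    """
    with open(src_path, newline="", encoding=encoding) as src, \
         open(dst_path, "w", newline="", encoding=encoding) as dst:
        # QUOTE_MINIMAL still quotes any field that contains a comma.
        writer = csv.writer(dst, quoting=csv.QUOTE_MINIMAL)
        for record in csv.reader(src):
            # Collapse each run of CR/LF characters in a field to one space.
            writer.writerow([re.sub(r"[\r\n]+", " ", field) for field in record])
```

The output file then has exactly one line per record, which may be easier for the SPSS or Excel import dialog to handle. Fields containing commas remain wrapped in double quotes, so the text qualifier still has to be set to the double quote on import.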
JYurkovich

I copy-pasted the data into Notepad++ yesterday, then converted it to ANSI and copy-pasted it back into Excel. Yesterday, it worked, but today it doesn't...

Anyway, maybe this thread is helpful for people with the same question. I will try again later.

Rianne