JSON to CSV: How to add filters (columns) in the final Excel table?

Question

First, I apologize if my description is not precise enough. I am a total newbie and know nothing about programming, so don't hesitate to tell me if you need more detail; I will try to be as precise as possible.

I have downloaded a bunch of tweets using Twitter's API from the Terminal (through Twurl). All the tweets are in a .json file (which I open with TextWrangler; I'm on a Mac). When I export the .json file to a .csv file, in order to process and analyze the data more easily in Excel (or at least the LibreOffice equivalent), I don't get all the fields I need for my study: the "bio" of each tweet's author, which is present in the .json file, is missing. In other words, my final table has a column for the tweet ID, one for the tweet author, one for the text of the tweet itself, and so on, but no column for the author's bio, even though that information is in the .json file. So my question is: is there some code or tool that would let me add one more column to my final .csv table, showing this extra information from the original .json file?

Again, this may not be clear, so don't hesitate to tell me if you need me to highlight a specific point.

Thanks in advance for any insight. I really need help on this one: it is for a research project I need to carry out for my PhD, so any help would be more than welcome!

EDIT: As an example, here is a sample of the data I have for one tweet in my original .json file:

{
    "created_at": "Mon Apr 28 09:00:40 +0000 2014",
    "id": 460705144846712800,
    "id_str": "460705144846712832",
    "text": "Work can suck a dick today",
    "source": "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>",
    "truncated": false,
    "in_reply_to_status_id": null,
    "in_reply_to_status_id_str": null,
    "in_reply_to_user_id": null,
    "in_reply_to_user_id_str": null,
    "in_reply_to_screen_name": null,
    "user": {
        "id": 253350311,
        "id_str": "253350311",
        "name": "JEEEZUS",
        "screen_name": "Maxi_Flex",
        "location": "Southchestershire",
        "url": "http://www.soundcloud.com/maxi_flex",
        "description": "Jazz Personality.G Mentality.",
        "protected": false,
        "followers_count": 457,
        "friends_count": 400,
        "listed_count": 1,
        "created_at": "Thu Feb 17 02:08:57 +0000 2011",
        "favourites_count": 1229,
        "utc_offset": null,
        "time_zone": null,
        "geo_enabled": true,
        "verified": false,
        "statuses_count": 13661,
        "lang": "en",
        "contributors_enabled": false,
        "is_translator": false,
        "is_translation_enabled": false,
        "profile_background_color": "08ABFC",
        "profile_background_image_url": "http://pbs.twimg.com/profile_background_images/444297891977244672/Z1BkfCFB.jpeg",
        "profile_background_image_url_https": "https://pbs.twimg.com/profile_background_images/444297891977244672/Z1BkfCFB.jpeg",
        "profile_background_tile": true,
        "profile_image_url": "http://pbs.twimg.com/profile_images/454073282778902529/gCGicDBH_normal.jpeg",
        "profile_image_url_https": "https://pbs.twimg.com/profile_images/454073282778902529/gCGicDBH_normal.jpeg",
        "profile_banner_url": "https://pbs.twimg.com/profile_banners/253350311/1392339276",
        "profile_link_color": "FA05F2",
        "profile_sidebar_border_color": "FFFFFF",
        "profile_sidebar_fill_color": "DDEEF6",
        "profile_text_color": "333333",
        "profile_use_background_image": true,
        "default_profile": false,
        "default_profile_image": false,
        "following": null,
        "follow_request_sent": null,
        "notifications": null
    },
    "geo": null,
    "coordinates": null,
    "place": null,
    "contributors": null,
    "retweet_count": 0,
    "favorite_count": 0,
    "entities": {
        "hashtags": [],
        "symbols": [],
        "urls": [],
        "user_mentions": []
    },
    "favorited": false,
    "retweeted": false,
    "filter_level": "medium",
    "lang": "en"
}

So in the final CSV file I have some of the info mentioned above, but what I need to add is the "description" field (inside the "user" object) of each tweet. Any help would be appreciated!

Thank you for answering so quickly, so here is an example of a string representing the data I have for one single tweet in my json file: — Michael Gauthier, Apr 28 '14 at 18:10
Ok, I don't have enough room in a comment to post an example of a sample, so I'll edit my original post... — Michael Gauthier, Apr 28 '14 at 18:14
I don't see it. Did you try to post it in a comment? You would need to update your question. — Usman Ismail, Apr 28 '14 at 18:14
Just did it, sorry it took me some time to get used to the interface... ; ) — Michael Gauthier, Apr 28 '14 at 18:18
If you aren't coding a solution, I don't think this is the right forum for you... — pherris, Apr 28 '14 at 18:43
On the contrary... I am not coding, so I came here precisely to seek help from people who do code, and thus who may be able to help me find a solution to my problem... — Michael Gauthier, Apr 28 '14 at 18:45

Usman Ismail · Answer 1 · 2014-04-28T18:56:45.123

Any good JSON-to-CSV converter will work; try this one. If there is something funky in the JSON, we need an example of the input JSON and of what is getting spit out.

If you just need that one field, enter the following command on the command line:

cat test.json | sed -n 's/.*description\": *\"\([^"]*\)\".*/Description, \1/p' > result.csv

Where test.json is the file with all the JSON entries in it.

Here is the output from an example I ran:

cat test.json | sed -n 's/.*description\":\"\([^"]*\)\".*/\1/p'
Jazz Personality.G Mentality.
Jazz Personality.G Mentality.
Jazz Personality.G Mentality.
Jazz Personality.G Mentality.
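
A regular expression like the one above can trip over bios that contain escaped quotes or over different spacing around the colon. As a sturdier alternative, the same extraction can be sketched in Python with the standard json and csv modules. This is only a minimal sketch, assuming the file holds one tweet object per line (an assumption about how the tweets were saved); the function name tweets_to_csv, the file names, and the choice of columns are illustrative and not part of the original answer.

```python
import csv
import json


def tweets_to_csv(json_path, csv_path):
    """Write one CSV row per tweet, including the author's bio.

    Assumes one JSON tweet object per line; blank lines are skipped.
    Returns the number of tweet rows written.
    """
    count = 0
    with open(json_path, encoding="utf-8") as src, \
            open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        # Header row; the column names are illustrative.
        writer.writerow(["tweet id", "tweet author", "author description", "tweet text"])
        for line in src:
            line = line.strip()
            if not line:
                continue
            tweet = json.loads(line)
            user = tweet.get("user") or {}
            writer.writerow([
                tweet.get("id_str", ""),
                user.get("screen_name", ""),
                user.get("description") or "",  # the bio can be null
                tweet.get("text", ""),
            ])
            count += 1
    return count
```

Calling tweets_to_csv("test.json", "result.csv") would then produce a file that LibreOffice or Excel can open, with the bio in its own column. The csv module takes care of quoting commas and line breaks inside tweet text.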

If the file is very large, you may need to split it into parts:

split -l N test.json part

Where N is the number of lines per part.
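
If split is not at hand, the same chunking idea can be sketched in Python. This is a minimal sketch under the same one-tweet-per-line assumption; the helper name chunk_lines is hypothetical.

```python
from itertools import islice


def chunk_lines(lines, n):
    """Yield successive lists of at most n lines from any iterable of lines."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, n))
        if not chunk:
            return
        yield chunk
```

An open file iterates line by line, so passing it to chunk_lines reads only n lines at a time instead of loading the whole file into memory.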

Thanks for the link, I didn't know about it, but the thing is that it won't work apparently because the file is too big... The point is that my file is extremely big (about fifty thousand tweets), hence my need to automatically process it through Excel for example... But thanks a lot for the advice, I really appreciate it! — Michael Gauthier, Apr 28 '14 at 18:20

score 1 · Answer 2 · edited May 23 '17 at 12:33

The problem is probably that JSON is hierarchical and CSV is not. I'm guessing that you are only getting the top-level JSON elements and not the nested objects. For example, if your JSON is:

{
  "name": "test",
  "author": {
    "id": 123,
    "created": ""
  }
}

are you only getting 'name' and not 'author.id'? If this is the case, check out other questions on SO related to flattening JSON for CSV, e.g. flattening json to csv format
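
To make the idea of flattening concrete, here is a minimal Python sketch that turns nested objects into a single level with dotted keys, which maps directly onto CSV columns. The function name flatten and the dotted naming scheme are illustrative choices, not something taken from the linked question.

```python
def flatten(obj, prefix=""):
    """Turn nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the parent key along.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


record = {"name": "test", "author": {"id": 123, "created": ""}}
print(flatten(record))
# → {'name': 'test', 'author.id': 123, 'author.created': ''}
```

Applied to a tweet, this would expose the bio as the key "user.description". Lists (such as the hashtags under "entities") are left untouched in this sketch.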

answered Apr 28 '14 at 18:06 by pherris · edited May 23 '17 at 12:33 by Community

Thanks for the answer! Yes, I am getting the author id. The full list of the parameters I have in my final csv file is: "tweet id", "tweet time", "tweet author", "tweet author id", "tweet language", "tweet geo" and "tweet text". I would need to add something like "author description" to have all the data I need. – Michael Gauthier Apr 28 '14 at 18:24
I just took a look at the link you provided me, and thanks by the way, but the thing is that as I said, I am a total newbie, so I don't really understand what is going on in this thread... I don't know most of the technical terms so I am a bit confused... :s – Michael Gauthier Apr 28 '14 at 18:28
