I am trying to adapt the JSON-to-CSV parser I found on GitHub. The code is set up to run from the terminal with three arguments: the node, the path to the JSON file, and the path of the CSV file to create.

I am trying to modify the code so that I can call it from another Python script that I am writing. From what I have learned, modules that run from the terminal use if __name__ == "__main__":, but if I want to run the code from another Python script I need to create a function such as def main() to call, right?
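
For reference, the usual pattern can be sketched roughly as below. This is only an illustration: the body of `main()` is a placeholder rather than the converter, and the `cli()` helper is a hypothetical name introduced here to map the terminal arguments onto `main()`.

```python
import sys


def main(node, json_file_path, csv_file_path):
    """Placeholder body: describe the conversion that would be run."""
    return "node=%s, in=%s, out=%s" % (node, json_file_path, csv_file_path)


def cli(argv):
    """Map command-line arguments onto main(); argv[0] is the script name."""
    if len(argv) != 4:
        return ("Usage: python json_to_csv.py "
                "<node_name> <json_in_file_path> <csv_out_file_path>")
    return main(argv[1], argv[2], argv[3])


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    print(cli(sys.argv))
```

With this layout the same file works both ways: the terminal goes through the `if __name__ == "__main__":` guard, while another script imports the module and calls `main()` with ordinary arguments.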

import sys
import json
import csv

# https://github.com/vinay20045/json-to-csv
##
# Convert to string keeping encoding in mind...
##


def to_string(s):
    try:
        return str(s)
    except:
        # Change the encoding type if needed
        return s.encode('utf-8')

def reduce_item(key, value):
    global reduced_item

    # Reduction Condition 1
    if type(value) is list:
        i = 0
        for sub_item in value:
            reduce_item(key + '_' + to_string(i), sub_item)
            i = i + 1

    # Reduction Condition 2
    elif type(value) is dict:
        sub_keys = value.keys()
        for sub_key in sub_keys:
            reduce_item(key + '_' + to_string(sub_key), value[sub_key])

    # Base Condition
    else:
        reduced_item[to_string(key)] = to_string(value)

# the function I created; the contents of the __main__ block were moved here
def main(node, json_file_path, csv_file_path):
    # Reading arguments
    # node = sys.argv[1]
    # json_file_path = sys.argv[2]
    # csv_file_path = sys.argv[3]

    fp = open(json_file_path, 'r')
    json_value = fp.read()
    raw_data = json.loads(json_value)
    print(raw_data['tag'])

    try:
        data_to_be_processed = raw_data[node]
    except:
        data_to_be_processed = raw_data

    processed_data = []
    header = []
    for item in data_to_be_processed:
        reduced_item = {}
        reduce_item(node, item)

        header += reduced_item.keys()

        processed_data.append(reduced_item)

    header = list(set(header))
    header.sort()

    with open(csv_file_path, 'a') as f:
        writer = csv.DictWriter(f, header, quoting=csv.QUOTE_ALL)
        writer.writeheader()
        for row in processed_data:
            writer.writerow(row)

    print ("Just completed writing csv file with %d columns" % len(header))


# if __name__ == "__main__":
#     if len(sys.argv) != 4:
#         print ("\nUsage: python json_to_csv.py <node_name> <json_in_file_path> <csv_out_file_path>\n")
#     else:
#         # Reading arguments
#     main(sys.argv)

Here is the other Python script I am using to call jsontocsv2.py:

import jsontocsv2
import json

filename = 'test2.csv'

SourceFile = 'carapi.json'

jsontocsv2.main('cars', SourceFile, filename)

Here are the errors I'm getting:

Traceback (most recent call last):
  File "/Users/Documents/Projects/test.py", line 8, in <module>
    jsontocsv2.main('cars', SourceFile, filename)
  File "/Users/Documents/Projects/jsontocsv2.py", line 84, in main
    reduce_item(node, item)
  File "/Users/Documents/Projects/jsontocsv2.py", line 57, in reduce_item
    reduce_item(key + '_' + to_string(sub_key), value[sub_key])
  File "/Users/Documents/Projects/jsontocsv2.py", line 61, in reduce_item
    reduced_item[to_string(key)] = to_string(value)
NameError: name 'reduced_item' is not defined

Can anyone help point me in the right direction for how to fix this? I did a lot of searching on Stack Overflow and found posts with similar issues, but I have not been able to figure out how to get this to work.

martineau
Marcel
  • Defining things as `global` is generally not a great idea. Try getting rid of `global reduced_item`, change the function definition to `def reduce_item(key, value, reduced_item):` and call it from inside `main()` with `reduced_item = reduce_item(node, item, reduced_item)` instead of `reduce_item(node, item)`. There's quite a bit of code to go through here so not sure if that alone will work. – roganjosh Dec 29 '17 at 18:58
  • Does my suggestion work? I'd rather not speculate on your json and try build a test myself from scratch, but I've got no feedback on where my suggestion got you. – roganjosh Dec 29 '17 at 19:46
  • Probably not relevant to the `NameError` issue, but the GitHub page says json_to_csv.py was written for Python 2.7 -- which may lead to (other) problems if you're using version 3. – martineau Dec 29 '17 at 20:21
  • @martineau looking at that library, I assume it was a personal throw-away script that was supposed to be called in isolation and the OP just stumbled upon it. I now think it's probably more hassle than it's worth to try to import this code instead of making your own :) – roganjosh Dec 29 '17 at 20:25
  • @roganjosh: Thanks, didn't realize that. Anyhow, it looks like the problem is exactly what it says, `name 'reduced_item' is not defined` anywhere. Declaring it `global` doesn't define it. Confusing matters is the fact that there **is** a function named `reduce_item()` defined which contains references to both `reduced_item` **and** to itself. Maybe that has something to do with the problem. – martineau Dec 29 '17 at 20:33
  • @martineau It's only a guess. The `NameError`, I guess, comes from the fact that the OP tried to adapt some code that they found online that was never meant to be imported, it's just a random repo of someone's work. The use of `global` then makes a mess of things. The OP, in the question, just stumbled on this code. Personally, I'd just write my own rather than trying to adapt. This code does nothing special IMO so adapting it is more difficult. – roganjosh Dec 29 '17 at 20:43
  • @roganjosh @martineau thank you for the help! For reference I am using Python 3 and the original script works in Python 3 from the terminal. I tried your suggestions @roganjosh and it led to a new error that I wasn't defining reduce_item when calling reduce_item. @roganjosh I wish I knew how to make my own code to do this. The easiest thing for me was to try and edit this one. I am still puzzled why the script works in the terminal, but my edited version does not... all I did was create a `def main` in place of `if __name__` – Marcel Dec 29 '17 at 20:56
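
The suggestion from the first comment (drop `global` and pass the dictionary around explicitly) could look roughly like the sketch below. It is not the original author's code: the optional `reduced_item=None` parameter, the `return` at the end, and the sample record are assumptions added for illustration.

```python
def reduce_item(key, value, reduced_item=None):
    """Flatten nested lists/dicts into one single-level dict of strings."""
    if reduced_item is None:
        reduced_item = {}  # fresh accumulator for each top-level call

    if isinstance(value, list):
        # Lists contribute their index to the flattened key
        for i, sub_item in enumerate(value):
            reduce_item("%s_%d" % (key, i), sub_item, reduced_item)
    elif isinstance(value, dict):
        # Dicts contribute their own keys to the flattened key
        for sub_key, sub_value in value.items():
            reduce_item("%s_%s" % (key, sub_key), sub_value, reduced_item)
    else:
        # Base case: a scalar value becomes one cell of the CSV row
        reduced_item[str(key)] = str(value)
    return reduced_item


# Hypothetical sample record, not taken from the question's carapi.json
row = reduce_item("cars", {"make": "Saab", "tags": ["a", "b"]})
print(row)
```

Inside `main()` the loop body would then become `reduced_item = reduce_item(node, item)`, so no function depends on a name that some other function happens to have defined.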

1 Answer

I was able to get the code to function just how I wanted.

All I had to do was move the global reduced_item statement from the def reduce_item() function to the def main(node, json_file_path, csv_file_path) function I created. If stating that it's a global variable doesn't define it, then I'm not sure why this worked.

Also, why is declaring something global generally not a great idea? If you have any recommendations for how to do this better, I'm open to guidance. Thank you for trying to help.

Marcel
  • The reason moving the `global reduced_item` statement helped is that it causes `reduced_item` to be a module-level "global" variable instead of a variable local to the `main()` function, as it was before the change. That allows references to that name in the other function, `reduce_item()`, to be resolved correctly (whether or not you also declare it as `global` in that function). This is due to the way Python resolves names. See this [answer](https://stackoverflow.com/a/292502/355230) to a different question for a good description of how Python's scoping rules work. – martineau Dec 30 '17 at 23:18
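
The scoping behaviour described in that comment can be reproduced in a few lines. This is a stripped-down sketch with made-up names (`helper`, `main_local`, `main_global`, `shared`), not the converter itself:

```python
def helper():
    # `shared` is not assigned here, so Python looks it up in the
    # module globals (and then builtins) when this line runs.
    return shared["count"]


def main_local():
    shared = {"count": 1}  # local to main_local; helper() cannot see it
    return helper()


def main_global():
    global shared          # the assignment below binds the module-level name
    shared = {"count": 2}
    return helper()


try:
    main_local()
    local_result = "ok"
except NameError as exc:
    local_result = "NameError: %s" % exc

print(local_result)
print(main_global())
```

`main_local()` mirrors the failing version from the question: the dictionary exists only inside that function. `main_global()` mirrors the fix: the `global` statement does not create the variable by itself, but it makes the assignment that follows create it at module level, where `helper()` can find it.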