I've been experimenting with JSON for the first time, using data I'm pulling from the kanka.io API. I'm trying to remove every element between 'entry' and either 'section' or 'entry_parsed', so I can determine whether an ID pertains to a character or an attribute and append only the character names to a list.
I've shortened the list that I turned the JSON into, for the sake of testing in Python Tutor's live programming mode.
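For comparison with the punctuation-stripping code that follows, here is a minimal sketch of reading the same kind of data with the standard `json` module instead. The `sample_text` payload, its `"data"` wrapper, and the helper name `character_names` are assumptions made for illustration; the real kanka.io response may be shaped differently.

```python
import json

# Hypothetical payload modelled on the tokens in this question; the real
# kanka.io response may use different keys or nesting.
sample_text = '''
{"data": [
  {"id": 260405, "name": "Frank Burns", "entry": null, "entry_parsed": null},
  {"id": 165148, "name": "Eyes", "entry": "Blue", "section": "appearance"}
]}
'''

def character_names(text):
    """Return names of records without a 'section' key (assumed to be characters)."""
    records = json.loads(text)["data"]  # the "data" wrapper is an assumption
    # Attributes are assumed to be the records that carry a 'section' key.
    return [r["name"].replace(" ", "_") for r in records if "section" not in r]

print(character_names(sample_text))  # ['Frank_Burns']
```

Working on parsed dictionaries keeps multi-word names and HTML entries intact, so nothing has to be re-joined or removed afterwards.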
import requests

# Request data from URL
response = requests.request("GET", url, headers=headers, data=payload)
# Open data
rtext=response.text
# Clean data
punct = ['{','}','[',']','\"',':',',']
rt = ""
for item in rtext:
    if item in punct:
        rt+=str(' ')
    else:
        rt+=str(item)
# Itemize string of text
rsplit = rt.split()
#rsplit = [
#'id', '260405', 'name', 'Frank', 'Burns', 'entry', 'null', 'entry_parsed', 'traits',
#'id', '260406', 'name', 'Henry', 'Blake', 'entry', 'null', 'entry_parsed', 'null', 'image', 'null',
#'id', '260407', 'name', 'Margret', 'Houlihan', 'entry', 'null', 'entry_parsed', 'null', 'image', 'true', 'is_private', 'true',
#'id', '260408', 'name', 'John', 'MacInyre', 'entry', '\\n<p>Graduate', 'of', 'Darthmouth.<\\/p>\\n<p>\\u00a0<\\/p>\\n', 'entry_parsed',
#'id', '260409', 'name', 'Walter', 'O\'Reilly', 'entry', 'null', 'entry_parsed', 'null', 'image', 'image_full', 'https',
#'id', '260410', 'name', 'Benjiam', 'Franklin', 'Pierce', 'entry', 'null', 'entry_parsed', 'null', 'image', 'image_full', 'https',
#'id', '165148', 'name', 'Eyes', 'entry', 'Blue', 'section', 'appearance', 'is_private', 'false', 'default_order', '1',
#'id', '260411', 'name', 'Francis', 'Mulcahy', 'entry', 'null', 'entry_parsed', 'null',
#]
#########
# NAMES #
#########
# Append character names into list
this1=0
# Cycle through all the words
while this1 < len(rsplit):
    next1 = this1+1
    last1 = this1-1
    # Stop at the first element after 'name'
    if rsplit[last1] == "name":
        # Read and concatenate elements until the element 'entry'
        while rsplit[next1] != "entry":
            nextword = rsplit[next1]
            rsplit[this1]+='_'+nextword
            # Remove redundant elements by replacing next with last
            rsplit[next1]=rsplit[this1]
            rsplit.remove(rsplit[this1])
        # Remove words in between entry and (entry_parsed or section)
        if rsplit[this1] == "entry":
            while rsplit[next1] != ("entry_parsed" or "section"):
                rsplit.remove(rsplit[descWord])
        print(rsplit[this1:next1+4])
    this1+=1
What I want the print line to output is:
['Frank_Burns', 'entry', 'entry_parsed', 'traits']
['Henry_Blake', 'entry', 'entry_parsed', 'null']
['Margret_Houlihan', 'entry', 'entry_parsed', 'null']
['John_MacInyre', 'entry', 'entry_parsed']
["Walter_O'Reilly", 'entry', 'entry_parsed', 'null']
['Benjiam_Franklin_Pierce', 'entry', 'entry_parsed', 'null']
['Eyes', 'entry', 'section', 'appearance']
['Francis_Mulcahy', 'entry', 'entry_parsed', 'null']
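One way to get at the character names from a token list like the one above, without removing elements while looping, is to scan once and build a new list. This is only a sketch: the helper name `names_from_tokens` and the shortened `tokens` sample are made up for illustration, and it assumes no name contains the words 'name' or 'entry'.

```python
def names_from_tokens(tokens):
    """Collect underscore-joined names whose 'entry' is followed by 'entry_parsed'."""
    markers = ("entry_parsed", "section")
    names = []
    i = 0
    while i < len(tokens):
        if tokens[i] != "name":
            i += 1
            continue
        # Join every word between 'name' and the next 'entry'.
        j = tokens.index("entry", i + 1)
        name = "_".join(tokens[i + 1:j])
        # Skip whatever sits between 'entry' and the first marker.
        k = j + 1
        while k < len(tokens) and tokens[k] not in markers:
            k += 1
        # 'section' is assumed to mark an attribute, 'entry_parsed' a character.
        if k < len(tokens) and tokens[k] == "entry_parsed":
            names.append(name)
        i = k + 1
    return names

# Shortened, hypothetical sample in the same shape as rsplit.
tokens = [
    "id", "260405", "name", "Frank", "Burns", "entry", "null", "entry_parsed", "traits",
    "id", "260408", "name", "John", "MacInyre", "entry", "<p>Graduate", "of", "Darthmouth.", "entry_parsed",
    "id", "165148", "name", "Eyes", "entry", "Blue", "section", "appearance",
    "id", "260411", "name", "Francis", "Mulcahy", "entry", "null", "entry_parsed", "null",
]
print(names_from_tokens(tokens))  # ['Frank_Burns', 'John_MacInyre', 'Francis_Mulcahy']
```

Because the input list is never modified, the indexes stay valid for the whole scan, which avoids the shifting that comes with calling `remove` inside the loop.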
I've tried different variations where the index used after 'entry' is this1, last1, or next1, and none of them actually removes the elements between 'entry' and 'entry_parsed' or 'section'. I've also tried
if rsplit[this1] == "entry":
    while not rsplit[next1] == "entry_parsed" or "section":
and it still keeps printing out 'null', 'Blue', etc.
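The two conditions above may not be testing what they appear to test. The snippet below is a small, self-contained sketch of how Python evaluates them, plus the difference between `list.remove` (removes by value) and `del` (removes by index). The variable names are illustrative only.

```python
markers = ("entry_parsed", "section")

# `or` returns its first truthy operand, so the parenthesised form
# collapses to a single string before the comparison happens.
collapsed = ("entry_parsed" or "section")
print(collapsed)  # entry_parsed

# First attempt: 'section' is never recognised as a stopping point.
token = "section"
first_attempt = token != ("entry_parsed" or "section")
print(first_attempt)  # True

# Second attempt parses as `(not (token == "entry_parsed")) or "section"`,
# which is truthy even when the token IS 'entry_parsed'.
token = "entry_parsed"
second_attempt = not token == "entry_parsed" or "section"
print(second_attempt)  # section

# A membership test checks both markers.
intended = token not in markers
print(intended)  # False

# list.remove(x) deletes the FIRST element equal to x, wherever it is;
# `del lst[i]` deletes the element at a specific index.
words = ["null", "entry", "null", "entry_parsed"]
by_value = list(words)
by_value.remove(words[2])
print(by_value)  # ['entry', 'null', 'entry_parsed']
by_index = list(words)
del by_index[2]
print(by_index)  # ['null', 'entry', 'entry_parsed']
```

Since 'null' appears many times in the list, removing by value can delete an earlier occurrence than the one intended, which is one reason the output may look unchanged.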