I have a string as follows:

{
"getYearsListOverview": {
    "sp_name": "analytics.year_overview_drop_down",
    "sp_input_params": {
        "req_url_query_params": [],
        "req_body_params": []
    },
    "sp_output_datasets": [],
    "page_name": "home"
},
"getRankingsDataPerformanceReport": {
    "sp_name": "analytics.get_performance_ranking_data",
    "sp_input_params": {
        "req_url_query_params": [
            ["@scroll_index", "index"]
        ],
        "req_body_params": [
            ["@event_type_id", "event_type_id"],
            ["@season", "season"],
            ["@athlete_guid", "athlete_guid"]
        ]
    },
    "sp_output_datasets": [],
    "number_of_output_datasets_for_customized_template": 4,
    "customised_response_template": {
        "performance_value_list": [],
        "rankings_table": [],
        "level_values": []
    },
    "page_name": "performancereport"
  }
}

I want to read the values of the sp_name, req_url_query_params, req_body_params and page_name fields and collect each of them in a list.

E.g.

sp_name = ['analytics.year_overview_drop_down','analytics.get_performance_ranking_data']
req_url_query_params = ['','@scroll_index, index']
req_body_params = ['','@event_type_id, "event_type_id,@season,season,@athlete_guid,athlete_guid']

It seems I have to use a regex with re.findall(), but I am not very proficient with regular expressions. I would be glad if someone could share the search pattern.

Pseudo Code:

s = re.findall(r'sp_name(\w+)*',original_string)

Will the above code work?

Wiktor Stribiżew
pythondumb

2 Answers

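The string is valid JSON, so it is easier to parse it with json.loads than to use a regex. The code below flattens the nested parameter lists with a nested generator expression, but you can flatten them any way you wish (sum(l, []) also works).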

import json

my_string = """
{
"getYearsListOverview": {
    "sp_name": "analytics.year_overview_drop_down",
    "sp_input_params": {
        "req_url_query_params": [],
        "req_body_params": []
    },
    "sp_output_datasets": [],
    "page_name": "home"
},
"getRankingsDataPerformanceReport": {
    "sp_name": "analytics.get_performance_ranking_data",
    "sp_input_params": {
        "req_url_query_params": [
            ["@scroll_index", "index"]
        ],
        "req_body_params": [
            ["@event_type_id", "event_type_id"],
            ["@season", "season"],
            ["@athlete_guid", "athlete_guid"]
        ]
    },
    "sp_output_datasets": [],
    "number_of_output_datasets_for_customized_template": 4,
    "customised_response_template": {
        "performance_value_list": [],
        "rankings_table": [],
        "level_values": []
    },
    "page_name": "performancereport"
  }
}"""

d = json.loads(my_string)

Once the string is parsed, there are many ways to extract the data. One of the more sensible ones:

# One entry per top-level key, in the order the keys appear in the JSON.
sp_name = [v["sp_name"] for v in d.values()]
# Flatten each list of [parameter, field] pairs and join it into one string.
req_url_query_params = [", ".join(i for pair in v["sp_input_params"]["req_url_query_params"] for i in pair) for v in d.values()]
req_body_params = [",".join(i for pair in v["sp_input_params"]["req_body_params"] for i in pair) for v in d.values()]
print(sp_name)
print(req_url_query_params)
print(req_body_params)
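
The question also asks for page_name, which the snippet above does not collect. A minimal sketch of the same idea, assuming every top-level entry has a "page_name" key as in the question's data; the shortened string and the names sample, parsed and entry are made up for illustration:

```python
import json

# Shortened stand-in for the question's string (hypothetical name: sample).
sample = """
{
  "getYearsListOverview": {
    "sp_name": "analytics.year_overview_drop_down",
    "page_name": "home"
  },
  "getRankingsDataPerformanceReport": {
    "sp_name": "analytics.get_performance_ranking_data",
    "page_name": "performancereport"
  }
}
"""

parsed = json.loads(sample)

# Dicts keep insertion order (Python 3.7+), so this list lines up
# position by position with the sp_name list.
page_name = [entry["page_name"] for entry in parsed.values()]
print(page_name)
```

If some entries might lack the key, entry.get("page_name", "") would avoid a KeyError, at the cost of silently filling in an empty string.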
Vulwsztyn

Use list comprehensions over the parsed JSON document.

from itertools import chain
import json

text = """{
"getYearsListOverview": {
    "sp_name": "analytics.year_overview_drop_down",
    "sp_input_params": {
        "req_url_query_params": [],
        "req_body_params": []
    },
    "sp_output_datasets": [],
    "page_name": "home"
},
"getRankingsDataPerformanceReport": {
    "sp_name": "analytics.get_performance_ranking_data",
    "sp_input_params": {
        "req_url_query_params": [
            ["@scroll_index", "index"]
        ],
        "req_body_params": [
            ["@event_type_id", "event_type_id"],
            ["@season", "season"],
            ["@athlete_guid", "athlete_guid"]
        ]
    },
    "sp_output_datasets": [],
    "number_of_output_datasets_for_customized_template": 4,
    "customised_response_template": {
        "performance_value_list": [],
        "rankings_table": [],
        "level_values": []
    },
    "page_name": "performancereport"
  }
}"""

data = json.loads(text)

sp_name = [value['sp_name'] for value in data.values()]
req_url_query_params = [
    ", ".join(chain.from_iterable(value['sp_input_params']['req_url_query_params']))  # Or as @Olvin suggested: ", ".join(item for sublist in value["sp_input_params"]["req_url_query_params"] for item in sublist)
    for value in data.values()
]
req_body_params = [
    ", ".join(chain.from_iterable(value['sp_input_params']['req_body_params']))  # Or as @Olvin suggested: ", ".join(item for sublist in value["sp_input_params"]["req_body_params"] for item in sublist)
    for value in data.values()
]

print(sp_name)
print(req_url_query_params)
print(req_body_params)

Output:

['analytics.year_overview_drop_down', 'analytics.get_performance_ranking_data']
['', '@scroll_index, index']
['', '@event_type_id, event_type_id, @season, season, @athlete_guid, athlete_guid']
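
As for the pattern from the question, a quick check suggests it would not capture the values: right after sp_name comes a double quote, which \w cannot match. The sketch below compares it with a stricter pattern and with the JSON approach. The one-entry string and the names sample, naive, by_regex and by_json are made up for illustration, and the second pattern is only an assumption about what a working regex might look like, not a robust JSON matcher:

```python
import json
import re

# One-entry stand-in for the question's string (hypothetical name: sample).
sample = '{"getYearsListOverview": {"sp_name": "analytics.year_overview_drop_down"}}'

# The pattern from the question: the group matches zero times because the
# character after sp_name is a double quote, so findall yields empty strings.
naive = re.findall(r'sp_name(\w+)*', sample)

# A pattern that skips the quotes and colon before capturing the value.
# It is fragile: escaped quotes inside a value would break it.
by_regex = re.findall(r'"sp_name"\s*:\s*"([^"]*)"', sample)

# Parsing the JSON sidesteps those pitfalls entirely.
by_json = [entry["sp_name"] for entry in json.loads(sample).values()]

print(naive)
print(by_regex)
print(by_json)
```

The regex also has no easy way to handle the nested req_url_query_params and req_body_params lists, which is another reason to prefer json.loads here.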
  • You can replace `chain.from_iterable` with nested loop: `", ".join(i for l in value["sp_input_params"]["req_url_query_params"] for i in l)` – Olvin Roght Sep 02 '21 at 09:00
  • I assume that will produce different strings for each sublist? I'm not sure as the question wants to have the joined string for all sublists `'@event_type_id, "event_type_id,@season,season,@athlete_guid,athlete_guid'`. Or maybe the expected output in the question was just written by mistake :) – Niel Godfrey Pablo Ponciano Sep 02 '21 at 09:02
  • It will produce exactly the same result as your solution. You can try `[", ".join(i for l in v["sp_input_params"]["req_url_query_params"] for i in l) for v in data.values()]` – Olvin Roght Sep 02 '21 at 09:03
  • Ahh! I see. The inner list comprehension. We could also use that. Thank you @Olvin – Niel Godfrey Pablo Ponciano Sep 02 '21 at 09:07