I cannot use any JSON-diff library such as jsondiff; I need to do this without external libraries.
default_config = {
    "pipeline_id": "2",
    "version": 1.0,
    "tasks": [
        {
            "task_group_id": "Task_group_1",
            "branch": [
                {
                    "task_id": "Task_Name_1",
                    "code_file_path": "tasks/base_creation/final_base_logic.hql",
                    "language": "hive",
                    "config": {
                        "k1": "v1"
                    },
                    "sequence": 1,
                    "condition": "in_start_date in range [2021-10-01 , 2023-11-04]"
                }
            ],
            "default": {
                "task_id": "Task_group_1_default",
                "code_file_path": "tasks/base_creation/default_base_logic.hql",
                "language": "hive",
                "config": {}
            }
        },
        {
            "task_group_id": "Task_group_2",
            "branch": [
                {
                    "task_id": "Task_Name_2",
                    "code_file_path": "tasks/variables_creation/final_cas_logic.py",
                    "language": "pyspark",
                    "config": {
                        "k2": "v2"
                    },
                    "sequence": 1,
                    "condition": "in_start_date in range [2022-02-01 , 2023-11-04]"
                },
                {
                    "task_id": "Task_Name_3",
                    "code_file_path": "tasks/variables_creation/final_sor_logic.py",
                    "language": "pyspark",
                    "config": {
                        "k3": "v3"
                    },
                    "sequence": 2,
                    "condition": "in_start_date in range [2021-10-01 , 2022-01-31]"
                }
            ],
            "default": {
                "task_id": "Task_group_2_default",
                "code_file_path": "tasks/variables_creation/default_variables_logic.py",
                "language": "pyspark",
                "config": {}
            }
        }
    ],
    "dependencies": " ['task_group_id_01_Name >> task_group_id_02_Name']"
}
update_config = {
    "pipeline_id": "2",
    "version": 1.0,
    "tasks": [
        {
            "task_group_id": "Task_group_1",
            "branch": [
                {
                    "task_id": "Task_Name_1",
                    "code_file_path": "tasks/base_creation/final_base_logic.hql",
                    "language": "hive",
                    "config": {
                        "A1": "B1"
                    },
                    "sequence": 1,
                    "condition": "in_start_date in range [2021-10-01 , 2023-11-04]"
                }
            ],
            "default": {
                "task_id": "Task_group_1_default",
                "code_file_path": "tasks/base_creation/default_base_logic.hql",
                "language": "hive",
                "config": {}
            }
        },
        {
            "task_group_id": "Task_group_2",
            "branch": [
                {
                    "task_id": "Task_Name_2",
                    "code_file_path": "tasks/variables_creation/final_cas_logic.py",
                    "language": "pyspark",
                    "config": {
                        "k2": "v2",
                        "kq1": "kw1"
                    },
                    "sequence": 1,
                    "condition": "in_start_date in range [2022-02-01 , 2023-11-04]"
                }
            ],
            "default": {
                "task_id": "Task_group_2_default",
                "code_file_path": "tasks/variables_creation/default_variables_logic.py",
                "language": "pyspark",
                "config": {}
            }
        }
    ],
    "dependencies": " ['task_group_id_01_Name >> task_group_id_02_Name']"
}
When update_config is compared with default_config, three things happen:
- In Task_group_1, the config of Task_Name_1 was changed.
- In Task_group_2, Task_Name_2 gained a second config entry in update_config: "kq1": "kw1".
- In Task_group_2, Task_Name_3 was deleted.
I only need that information, so that I can create a JSON like this:
api = {
    "pipelineId": 2,
    "pipelineVersion": 1,
    "taskConfigurations": [
        {
            "taskName": "Task_Name_1",
            "configKey": "A1",
            "configValue": "B1"
        },
        {
            "taskName": "Task_Name_2",
            "configKey": "kq1",
            "configValue": "kw1"
        }
    ]
}
In the api JSON above you can see that it contains only the updated information, where "updated" means what differs when comparing update_config with default_config.
Please help me out: how can I produce these changes using Python?
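For context, here is a rough sketch of the comparison I have in mind (the helper names index_tasks and build_api_payload are my own placeholders; it assumes task_id is unique across all task groups and that only the per-task "config" dicts need to be compared):

def index_tasks(config):
    """Map task_id -> config dict for every branch task in every task group."""
    tasks = {}
    for group in config["tasks"]:
        for task in group["branch"]:
            tasks[task["task_id"]] = task["config"]
    return tasks

def build_api_payload(default_config, update_config):
    old_tasks = index_tasks(default_config)
    new_tasks = index_tasks(update_config)

    task_configurations = []
    for task_id, new_cfg in new_tasks.items():
        old_cfg = old_tasks.get(task_id, {})
        for key, value in new_cfg.items():
            # Keep only keys that are new or whose value changed.
            # Tasks/keys that were deleted are ignored, matching the api example.
            if key not in old_cfg or old_cfg[key] != value:
                task_configurations.append({
                    "taskName": task_id,
                    "configKey": key,
                    "configValue": value,
                })

    return {
        "pipelineId": int(update_config["pipeline_id"]),
        "pipelineVersion": int(update_config["version"]),
        "taskConfigurations": task_configurations,
    }

api = build_api_payload(default_config, update_config)
print(api)

Is a plain nested-loop approach like this the right way, or is there a cleaner way to detect the changed/added config keys?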