I cannot use a jsondiff library; I need to do this without any library.

default_config = {
    "pipeline_id": "2",
    "version": 1.0,
    "tasks": [
        {
            "task_group_id": "Task_group_1",
            "branch": [
                {
                    "task_id": "Task_Name_1",
                    "code_file_path": "tasks/base_creation/final_base_logic.hql",
                    "language": "hive",
                    "config": {
                        "k1": "v1"
                    },
                    "sequence": 1,
                    "condition": "in_start_date in range [2021-10-01 , 2023-11-04]"
                }
            ],
            "default": {
                "task_id": "Task_group_1_default",
                "code_file_path": "tasks/base_creation/default_base_logic.hql",
                "language": "hive",
                "config": {}
            }
        },
        {
            "task_group_id": "Task_group_2",
            "branch": [
                {
                    "task_id": "Task_Name_2",
                    "code_file_path": "tasks/variables_creation/final_cas_logic.py",
                    "language": "pyspark",
                    "config": {
                        "k2": "v2"
                    },
                    "sequence": 1,
                    "condition": "in_start_date in range [2022-02-01 , 2023-11-04]"
                },
                {
                    "task_id": "Task_Name_3",
                    "code_file_path": "tasks/variables_creation/final_sor_logic.py",
                    "language": "pyspark",
                    "config": {
                        "k3": "v3"
                    },
                    "sequence": 2,
                    "condition": "in_start_date in range [2021-10-01 , 2022-01-31]"
                }
            ],
            "default": {
                "task_id": "Task_group_2_default",
                "code_file_path": "tasks/variables_creation/default_variables_logic.py",
                "language": "pyspark",
                "config": {}
            }
        }
    ],
    "dependencies": " ['task_group_id_01_Name >> task_group_id_02_Name']"
}



update_config = {
    "pipeline_id": "2",
    "version": 1.0,
    "tasks": [
        {
            "task_group_id": "Task_group_1",
            "branch": [
                {
                    "task_id": "Task_Name_1",
                    "code_file_path": "tasks/base_creation/final_base_logic.hql",
                    "language": "hive",
                    "config": {
                        "A1": "B1"
                    },
                    "sequence": 1,
                    "condition": "in_start_date in range [2021-10-01 , 2023-11-04]"
                }
            ],
            "default": {
                "task_id": "Task_group_1_default",
                "code_file_path": "tasks/base_creation/default_base_logic.hql",
                "language": "hive",
                "config": {}
            }
        },
        {
            "task_group_id": "Task_group_2",
            "branch": [
                {
                    "task_id": "Task_Name_2",
                    "code_file_path": "tasks/variables_creation/final_cas_logic.py",
                    "language": "pyspark",
                    "config": {
                        "k2": "v2",
                        "kq1": "kw1"
                    },
                    "sequence": 1,
                    "condition": "in_start_date in range [2022-02-01 , 2023-11-04]"
                }
            ],
            "default": {
                "task_id": "Task_group_2_default",
                "code_file_path": "tasks/variables_creation/default_variables_logic.py",
                "language": "pyspark",
                "config": {}
            }
        }
    ],
    "dependencies": " ['task_group_id_01_Name >> task_group_id_02_Name']"
}

Comparing update_config with default_config shows three changes:

  1. For Task_group_1, the config of Task_Name_1 was changed ("k1": "v1" was replaced by "A1": "B1").
  2. For Task_group_2, Task_Name_2 gained a second config entry, "kq1": "kw1", in update_config.
  3. For Task_group_2, Task_Name_3 was deleted.
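The comparison described above can be sketched with plain dictionaries and no diff library. This is only a minimal sketch under a few assumptions: tasks are matched by their "task_id", only the "branch" tasks are compared (the "default" blocks are ignored), and a config key counts as changed when it is new or has a different value in update_config. Keys that were removed (such as "k1") are not reported, which matches the desired output. The function names index_branch_configs and diff_task_configs and the trimmed sample data are hypothetical.

```python
_MISSING = object()  # sentinel: key absent from the default config


def index_branch_configs(pipeline):
    """Map each branch task_id to its config dict (hypothetical helper)."""
    configs = {}
    for group in pipeline.get("tasks", []):
        for task in group.get("branch", []):
            configs[task["task_id"]] = task.get("config", {})
    return configs


def diff_task_configs(default, update):
    """Return (changed, deleted).

    changed: (task_id, key, value) for every key that is new or has a
             different value in `update`.
    deleted: task_ids present in `default` but missing from `update`.
    """
    old = index_branch_configs(default)
    new = index_branch_configs(update)
    changed = [
        (task_id, key, value)
        for task_id, cfg in new.items()
        for key, value in cfg.items()
        if old.get(task_id, {}).get(key, _MISSING) != value
    ]
    deleted = [task_id for task_id in old if task_id not in new]
    return changed, deleted


# Trimmed-down versions of the two configs, keeping only the compared fields.
default_sample = {"tasks": [
    {"task_group_id": "Task_group_1", "branch": [
        {"task_id": "Task_Name_1", "config": {"k1": "v1"}}]},
    {"task_group_id": "Task_group_2", "branch": [
        {"task_id": "Task_Name_2", "config": {"k2": "v2"}},
        {"task_id": "Task_Name_3", "config": {"k3": "v3"}}]},
]}
update_sample = {"tasks": [
    {"task_group_id": "Task_group_1", "branch": [
        {"task_id": "Task_Name_1", "config": {"A1": "B1"}}]},
    {"task_group_id": "Task_group_2", "branch": [
        {"task_id": "Task_Name_2", "config": {"k2": "v2", "kq1": "kw1"}}]},
]}

changed, deleted = diff_task_configs(default_sample, update_sample)
print(changed)
print(deleted)
```

If task ids are not unique across task groups, the index would need to be keyed on the pair of task_group_id and task_id instead.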

I only need this information to create a JSON like this:

api = {
  "pipelineId": 2,
  "pipelineVersion": 1,
  "taskConfigurations": [
    {
      "taskName": "Task_Name_1",
      "configKey": "A1",
      "configValue": "B1"
    },
    {
      "taskName": "Task_Name_2",
      "configKey": "kq1",
      "configValue": "kw1"
    }
  ]
}

The api JSON above contains only the updated information, meaning the differences found when comparing update_config with default_config.
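A payload of that shape could be assembled from a list of changed entries roughly as follows. This is a sketch, not a definitive implementation: build_api_payload is a hypothetical name, the changed entries are assumed to be (task name, key, value) tuples, and the int() casts are an assumption made only because the desired output shows 2 and 1 where the configs have "2" and 1.0.

```python
import json


def build_api_payload(pipeline, changed):
    """Build the api dict from a pipeline config and (task, key, value) tuples."""
    return {
        # Assumption: the target API expects integers here.
        "pipelineId": int(pipeline["pipeline_id"]),
        "pipelineVersion": int(pipeline["version"]),
        "taskConfigurations": [
            {"taskName": task, "configKey": key, "configValue": value}
            for task, key, value in changed
        ],
    }


# Example input: the two changes listed in the question.
changed_entries = [
    ("Task_Name_1", "A1", "B1"),
    ("Task_Name_2", "kq1", "kw1"),
]
api = build_api_payload({"pipeline_id": "2", "version": 1.0}, changed_entries)
print(json.dumps(api, indent=2))
```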

Please help me out. How can I do this in Python?

Adriaan
Professor
  • Please do not add answers to the question body itself. Instead, you should add it as an answer. [Answering your own question is allowed and even encouraged](https://stackoverflow.com/help/self-answer). Given that this question is closed, either [edit] it to show why the proposed duplicate does not answer this question and/or post your answer on the duplicate target instead. – Adriaan Feb 21 '23 at 12:16
