I'm using FastAPI with Pydantic for schema validation.
In my FastAPI application, I receive data from an external source in the following format:
example_list1 = [
    ['11','customer1','tran1','item1','itemcd1','discount1','discountcoupon','cash1','cash2'],
    ['11','customer1','tran1','item1','itemcd1','discount2','discountcoupon','cash1','cash2'],
    ['11','customer1','tran1','item1','itemcd2','discount1','discountcoupon','cash1','cash2'],
    ['12','customer1','tran2','item1','itemcd1','discount1','discountcoupon','cash1','cash2'],
    ['11','customer1','tran1','item1','itemcd1','discount1','discountcoupon2','cash1','cash2'],
    ['12','customer1','tran2','item1','itemcd1','discount1','discountcoupon2','cash1','cash2'],
    ['13','customer1','tran3','item1','itemcd1','None','None','cash1','cash2']
]
The API response must have the following structure:
final_output = [
    {
        "id":"11",
        "custname":"customer1",
        "tranname":"tran1",
        "itemdetail":[
            {
                "seqno":"item1",
                "itemcd":"itemcd1"
            },
            {
                "seqno":"item1",
                "itemcd":"itemcd2"
            }
        ],
        "discountdetail":[
            {
                "seqno":"discount1",
                "dname":"discountcoupon"
            },
            {
                "seqno":"discount2",
                "dname":"discountcoupon"
            },
            {
                "seqno":"discount1",
                "dname":"discountcoupon2"
            }
        ],
        "tenderdetail":[
            {
                "seqno":"cash1",
                "type":"cash2"
            }
        ]
    },
    {
        "id":"12",
        "custname":"customer1",
        "tranname":"tran2",
        "itemdetail":[
            {
                "seqno":"item1",
                "itemcd":"itemcd1"
            }
        ],
        "discountdetail":[
            {
                "seqno":"discount1",
                "dname":"discountcoupon"
            },
            {
                "seqno":"discount1",
                "dname":"discountcoupon2"
            }
        ],
        "tenderdetail":[
            {
                "seqno":"cash1",
                "type":"cash2"
            }
        ]
    },
    {
        "id":"13",
        "custname":"customer1",
        "tranname":"tran3",
        "itemdetail":[
            {
                "seqno":"item1",
                "itemcd":"itemcd1"
            }
        ],
        "discountdetail":[],
        "tenderdetail":[
            {
                "seqno":"cash1",
                "type":"cash2"
            }
        ]
    }
]
How can I produce the expected result quickly and efficiently? The data set is huge, and I cannot use a dataframe library or a Parquet conversion, because either would run into memory constraints under FastAPI.
A few conditions: tenderdetail and discountdetail must be an empty list when the incoming values are 'None', and within the same transaction id, itemdetail, tenderdetail and discountdetail must not contain duplicates.
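To make these conditions concrete, here is a minimal single-pass sketch using only the standard library. The names group_transactions, SECTIONS and NONE_MARKER are hypothetical. It assumes each row always has the nine columns shown above, that a detail pair is skipped whenever its first field is the string 'None', and that output order should follow first appearance in the input. It is meant as an illustration of the rules, not a tuned implementation.

```python
NONE_MARKER = "None"  # assumption: the feed sends the string 'None', not the None object

# (output key, column of the seqno field, column of the second field, name of the second field)
SECTIONS = (
    ("itemdetail", 3, 4, "itemcd"),
    ("discountdetail", 5, 6, "dname"),
    ("tenderdetail", 7, 8, "type"),
)


def group_transactions(rows):
    """Group flat rows by transaction id in one pass, dropping duplicate detail pairs."""
    grouped = {}   # id -> output record; dicts keep insertion order
    seen = set()   # (id, section, seqno, value) tuples already emitted
    for row in rows:
        tran_id = row[0]
        record = grouped.get(tran_id)
        if record is None:
            # First time this id appears: create the record with empty detail lists.
            record = {
                "id": tran_id,
                "custname": row[1],
                "tranname": row[2],
                "itemdetail": [],
                "discountdetail": [],
                "tenderdetail": [],
            }
            grouped[tran_id] = record
        for key, seq_col, value_col, value_name in SECTIONS:
            seqno, value = row[seq_col], row[value_col]
            if seqno == NONE_MARKER:
                continue  # leaves the list empty when only 'None' pairs arrive
            marker = (tran_id, key, seqno, value)
            if marker in seen:
                continue  # duplicate within the same transaction id
            seen.add(marker)
            record[key].append({"seqno": seqno, value_name: value})
    return list(grouped.values())
```

Calling group_transactions(example_list1) is intended to yield the final_output structure shown above. The set lookup keeps the duplicate check at roughly constant time per pair, so the whole pass is linear in the number of rows.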
Could multithreading be used to prepare the API response?