I have a large DataFrame that originally held forecast data for the overall application. I then added forecast columns for each service, based on the percentage of the overall traffic that each service receives.
PeakHourForecast address_Forecast arc_Forecast auth_Forecast \
0 747.787093 186.946773 186.946773 411.282901
1 691.159730 172.789933 172.789933 380.137852
2 655.040498 163.760124 163.760124 360.272274
3 630.850889 157.712722 157.712722 346.967989
4 619.764089 154.941022 154.941022 340.870249
... ... ... ... ...
42403 1097.177031 274.294258 274.294258 603.447367
42404 1060.533763 265.133441 265.133441 583.293570
42405 1024.620098 256.155024 256.155024 563.541054
42406 961.448085 240.362021 240.362021 528.796447
42407 875.026753 218.756688 218.756688 481.264714
authreversal_Forecast bill_Forecast credit_Forecast \
0 74.778709 269.203353 74.778709
1 69.115973 248.817503 69.115973
2 65.504050 235.814579 65.504050
3 63.085089 227.106320 63.085089
4 61.976409 223.115072 61.976409
... ... ... ...
42403 109.717703 394.983731 109.717703
42404 106.053376 381.792155 106.053376
42405 102.462010 368.863235 102.462010
42406 96.144809 346.121311 96.144809
42407 87.502675 315.009631 87.502675
Based on this, I have a secondary column for each service that is True or False depending on whether the forecast for that service is above its current capacity. However, because the DataFrame is so large, printing it shows only a small number of the rows and columns. Some components may have risk as False for most rows but True in a few spots, and I am not seeing those in the printed output.
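One way to get around the truncated print is to summarise the flags instead of displaying them. The sketch below counts the True values per _RiskPresent column. The small data2 frame is a made-up stand-in for the real one (its values are invented for illustration), and it assumes the flag columns have a boolean dtype.

```python
import pandas as pd

# Hypothetical stand-in for data2; the values are invented for illustration.
data2 = pd.DataFrame({
    "arc_Forecast": [186.9, 172.8, 163.8, 157.7],
    "arc_RiskPresent": [True, True, True, True],
    "auth_Forecast": [411.3, 380.1, 360.3, 347.0],
    "auth_RiskPresent": [False, False, True, False],
    "bill_Forecast": [269.2, 248.8, 235.8, 227.1],
    "bill_RiskPresent": [False, False, False, False],
})

# Select only the per-service flag columns (assumed to be boolean).
risk = data2.filter(like="_RiskPresent")

# Summing booleans counts the True values in each column.
true_counts = risk.sum()

# Keep only the columns that are True at least once.
at_risk = true_counts[true_counts > 0]
print(at_risk)
```

Because the result has one entry per service rather than one per row, it stays short no matter how many rows the DataFrame has.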
I had been trying to see the risk level of each service by simply filtering with data2.filter(like='Risk')
which gives the output below:
RiskPresent address_RiskPresent arc_RiskPresent auth_RiskPresent \
0 False False True False
1 False False True False
2 False False True False
3 False False True False
4 False False True False
... ... ... ... ...
42403 False False True False
42404 False False True False
42405 False False True False
42406 False False True False
42407 False False True False
authreversal_RiskPresent bill_RiskPresent credit_RiskPresent \
0 False False False
1 False False False
2 False False False
3 False False False
4 False False False
... ... ... ...
42403 False False False
42404 False False False
42405 False False False
42406 False False False
42407 False False False
As we can see, arc_RiskPresent is True in basically every row. However, looking through an exported Excel file, I can see that other columns have True values here and there. So how can I find, for every _RiskPresent column, all the rows where it is True? Ideally, I would then like to tie each _RiskPresent = True row to the _Forecast value for that component as well.
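To make the goal concrete, here is a minimal sketch of one possible approach: a boolean mask picks out the flagged rows, and a loop pairs each service's flagged rows with its own forecast column. It assumes every X_RiskPresent column has a matching X_Forecast column and a boolean dtype. The data2 frame here is a made-up stand-in and the name flagged is only illustrative.

```python
import pandas as pd

# Hypothetical stand-in for data2; the values are invented for illustration.
data2 = pd.DataFrame({
    "arc_Forecast": [186.9, 172.8, 163.8, 157.7],
    "arc_RiskPresent": [True, True, True, True],
    "auth_Forecast": [411.3, 380.1, 360.3, 347.0],
    "auth_RiskPresent": [False, False, True, False],
    "bill_Forecast": [269.2, 248.8, 235.8, 227.1],
    "bill_RiskPresent": [False, False, False, False],
})

suffix = "_RiskPresent"
risk_cols = [c for c in data2.columns if c.endswith(suffix)]

# Rows where at least one service is flagged.
any_risk_rows = data2[data2[risk_cols].any(axis=1)]

# For each service, keep only its flagged rows next to its forecast.
flagged = {}
for col in risk_cols:
    service = col[: -len(suffix)]
    mask = data2[col]  # assumed boolean
    if mask.any():
        flagged[service] = data2.loc[mask, [f"{service}_Forecast", col]]

for service, frame in flagged.items():
    print(service, "flagged rows:", len(frame))
    print(frame)
```

The original index is preserved in each per-service frame, so a flagged row can be traced back to its position in the full DataFrame.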
I have been searching for this, but the results are all very basic and haven't been very helpful. The closest I have found is something like the code below, but it isn't getting me very far and it produces odd NaN values that I don't see in the Excel output file.
a = data2.filter(like='_RiskPresent').apply(lambda row: row[row==True], axis=1)
print(a)
arc_RiskPresent fingerprint_RiskPresent giftcardservice_RiskPresent \
0 True NaN NaN
1 True NaN NaN
2 True NaN NaN
3 True NaN NaN
4 True NaN NaN
... ... ... ...
42403 True NaN True
42404 True NaN True
42405 True NaN True
42406 True NaN True
42407 True NaN True
paypalservice_RiskPresent
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
... ...
42403 NaN
42404 NaN
42405 NaN
42406 NaN
42407 NaN
However, doing print(a.all())
seems to at least give me each column name that has True in it somewhere. I'm not sure whether this is actually 100% of them, though, and it doesn't help me identify where in the forecast data we are going over capacity, so I cannot tell how far over it is.
arc_RiskPresent True
fingerprint_RiskPresent True
giftcardservice_RiskPresent True
paypalservice_RiskPresent True
dtype: bool
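As far as I can tell, the NaN values appear because the lambda returns only the True entries of each row, and pandas fills the missing positions with NaN when it lines the rows back up into one frame. Since all() skips NaN by default, every column that survives in a reports True. A more direct check, sketched below on a made-up stand-in for data2 with boolean flag columns, is to call any() on the filtered frame itself and to compare it with all().

```python
import pandas as pd

# Hypothetical stand-in for data2; the values are invented for illustration.
data2 = pd.DataFrame({
    "arc_RiskPresent": [True, True, True, True],
    "auth_RiskPresent": [False, False, True, False],
    "bill_RiskPresent": [False, False, False, False],
})

risk = data2.filter(like="_RiskPresent")

# any(): True if the column contains at least one True.
has_any = risk.any()

# all(): True only if every row in the column is True.
always = risk.all()

# Names of the columns that are flagged somewhere.
cols_with_risk = has_any[has_any].index.tolist()

print(cols_with_risk)
print(always)
```

Here any() answers "is this service ever at risk?", while all() answers "is it at risk in every row?", which is a stricter question.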