0

I have a dataframe with 4 columns: NAME, VAL1, VAL2, COST. The COST column has values stored as dictionary.

NAME    VAL1    VAL2     COST
----------------------------------------------------------------------------
BACD    259.35  267.00  {'COST[2]': [256.12], 'COST[5]': [257.72], 'COST[16]': [266.3]}
BACD    256.00  257.95  {'COST[2]': [255.27], 'COST[5]': [257.91], 'COST[16]': [266.3]}
BACD    257.90  262.00  {'COST[2]': [258.15], 'COST[5]': [258.82], 'COST[16]': [266.3]}
BACD    260.10  261.40  {'COST[2]': [260.32], 'COST[5]': [257.75], 'COST[16]': [266.3]}.
BACD    260.45  264.90  {'COST[2]': [261.58], 'COST[5]': [258.83], 'COST[16]': [266.3]}

Need an Output as below where df.COST=='COST[5],

NAME    VAL2     COST
-----------------------
BACD    267.00  257.72
BACD    257.95  257.91
BACD    262.00  258.82
BACD    261.40  257.75
BACD    264.90  258.83

Tried with the below query, but it doesn't work

print(df[(df.NAME=='BACD') & (df.COST=='COST[5]')][['NAME', 'VAL2', 'COST']]) 

what would be the valid query?

wjandrea
  • 28,235
  • 9
  • 60
  • 81
  • Why do you have a column containing dicts in the first place? Have you considered ["exploding" them into their own columns](/q/38231591/4518341)? And why does one of them have a period at the end? That suggests they're strings, not dicts. – wjandrea Feb 10 '23 at 20:05
  • BTW, welcome to Stack Overflow! Check out the [tour] and [How to ask a good question](/help/how-to-ask) for tips like making a [mre]. For specifics, see [How to make good reproducible pandas examples](/q/20109391/4518341). – wjandrea Feb 10 '23 at 20:06
  • If it helps, `df.COST=='COST[5]'` is very much not what you want. Take a look at [Extract value from dictionary in dataframe column](/q/65674646/4518341) and [how to replace an entire column on Pandas.DataFrame](/q/36846060/4518341) for something a little closer. – wjandrea Feb 10 '23 at 20:21

1 Answers1

0

I had to reassign the COST column to get the value for COST[5]:

df['COST'] = df.COST.apply(lambda x: x.get('COST[5]'))

Then you can call:

print(df[(df.NAME=='BACD')][['NAME', 'VAL2', 'COST']])