1
 This is my DataFrame
                 Date         Value
0  "date": "1999-01-01  "s1":3.0000}
1  "date": "1999-01-02  "s1":3.0000}
2  "date": "1999-01-03  "s1":3.0000}
3  "date": "1999-01-04  "s1":3.0000}
4  "date": "1999-01-05  "s1":3.0000}

I want this DataFrame to be Transformed like this 

    Date             Value
    1999-01-01        3
    1999-01-02        3
    1999-01-03        3
    1999-01-04        3
    1999-01-05        3
    1999-01-06        3

I tried this

cols = ['Date', 'Value']
for col in cols:
    DataAll[col] = DataAll[col].map(lambda x: str(x).lstrip('{}').rstrip('"date:")({)(:)(s1)(})'))

Please if anyone has a solution for this Please help. I have tied mny times for the solution and i didnt get any solution that has pure solution.

Pavan Suvarna
  • 486
  • 3
  • 13

2 Answers2

2

You can chain text methods first for strip {}, then split by :, select second lists and last remove trailing " and spaces:

cols = ['Date', 'Value']
f = lambda x: x.astype(str).str.strip('{}').str.split(':').str[1].str.strip(' "')
DataAll[cols] = DataAll[cols].apply(f)

print (DataAll)
         Date   Value
0  1999-01-01  3.0000
1  1999-01-02  3.0000
2  1999-01-03  3.0000
3  1999-01-04  3.0000
4  1999-01-05  3.0000

If jsons in column then first convert values to dictionaries in list comprehension and pass to DataFrame constructor:

print (DataAll)
                             json_col
0  {"date": "1999-01-01","s1":3.0000}
1  {"date": "1999-01-02","s1":3.0000}
2  {"date": "1999-01-03","s1":3.0000}
3  {"date": "1999-01-04","s1":3.0000}
4  {"date": "1999-01-05","s1":3.0000}

import ast

DataAll1 = pd.DataFrame([ast.literal_eval(x) for x in DataAll['json_col']])
print (DataAll1)
         date   s1
0  1999-01-01  3.0
1  1999-01-02  3.0
2  1999-01-03  3.0
3  1999-01-04  3.0
4  1999-01-05  3.0
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
1

You can just find the string between ':' and '.' as below

import numpy as np
import pandas as pd

pan = pd.DataFrame({'date': ["1999-01-01", "1999-01-02","1999-01-03","1999-01-04","1999-01-05"], 'Value': ['"s1":3.0000', '"s1":3.0000', '"s1":3.0000', '"s1":3.0000', '"s1":3.0000']})

def find_between( s, first, last ):
    try:
        start = s.index( first ) + len( first )
        end = s.index( last, start )
        return s[start:end]
    except ValueError:
        return ""

for index, row in pan.iterrows():
    print(row['date'],find_between(row['Value'], ':', '.'))

find_between function will return the string between : and .

Function found at Find string between two substrings

Viswanath Polaki
  • 1,357
  • 1
  • 10
  • 19