I have sample data that looks like this (the real dataset has more columns):
import pandas as pd
data = {'stringID':['AB CD Efdadasfd','RFDS EDSfdsadf dsa','FDSADFDSADFFDSA'],'IDct':[1,3,4]}
data = pd.DataFrame(data)
data['Index1'] = [[3,6],[7,9],[5,6]]
data['Index2'] = [[4,8],[10,13],[8,9]]
What I want to achieve is to slice the stringID column using the second element of Index1 as the start and the second element of Index2 as the end (both columns hold lists), but only if the IDct value is greater than 1; otherwise the result should be NaN.
I tried the following, which produces the Output1 column as expected, but there must be a better way (I mean a faster one when applied to a large dataset). Please advise, thanks!
data['pos'] = data.Index1.map(lambda x: x[1])
data['pos1'] = data.Index2.map(lambda x: x[1])
def cal(m):
    if m['IDct'] > 1:
        return m['stringID'][m['pos']:m['pos1']]
    else:
        return 'NaN'
data['Output1'] = data.apply(cal,axis=1)
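For comparison, here is a minimal sketch of one alternative I am considering: a plain list comprehension over the zipped columns, which skips both the helper columns and the row-wise apply. The column name Output2 is only a placeholder, and this version returns a real np.nan rather than the string 'NaN', which is an assumption about what is actually wanted. It is usually faster than apply with axis=1 because no Series is built per row, but it would need timing on the real data.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'stringID': ['AB CD Efdadasfd', 'RFDS EDSfdsadf dsa', 'FDSADFDSADFFDSA'],
    'IDct': [1, 3, 4],
})
data['Index1'] = [[3, 6], [7, 9], [5, 6]]
data['Index2'] = [[4, 8], [10, 13], [8, 9]]

# Walk the four columns in parallel as plain Python values.
# start[1] and stop[1] are the second elements of the Index1 and Index2 lists.
# 'Output2' is a hypothetical column name; np.nan replaces the string 'NaN'.
data['Output2'] = [
    s[start[1]:stop[1]] if ct > 1 else np.nan
    for s, ct, start, stop in zip(
        data['stringID'], data['IDct'], data['Index1'], data['Index2']
    )
]
```

On the sample data, the first row (IDct of 1) gets NaN and the other two rows get the slices 'dsad' and 'DSA'.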