I have a data frame (df) with 47 columns and 30,000 rows. The columns are listed below:
Index(['Unnamed: 0', 'CtpJobId', 'TransformJobStateId', 'LastError',
'PriorityDate', 'QueuedTime', 'AccurateAsOf', 'SentToDevice',
'StartedAtDevice', 'ProcessStart', 'LastProgressAt', 'ProcessEnd',
'OutputFileDuration', 'Tags', 'SegmentId', 'VideoId',
'ClipFirstFrameNumber', 'ClipLastFrameNumber', 'SourceId',
'SourceNamedLocation', 'SourceDirectory', 'SourceFileSize',
'srcMediaFormat', 'srcFrameRate', 'srcWidth', 'srcHeight', 'srcCodec',
'srcDuration', 'TargetId', 'TargetNamedLocation', 'TargetDirectory',
'TargetFilename', 'Description', 'TargetTags', 'tgtFrameRate',
'tgtDropFrame', 'tgtWidth', 'tgtHeight', 'tgtCodec', 'DeviceType',
'DeviceResourceId', 'AssignedDeviceId', 'DeviceName',
'AssignedDeviceJobId', 'DeviceUri'],
dtype='object')
I want to apply a function to selected columns of that data frame to create a new column, df['seg_duration']. My function is below:
def seq_duration(df):
    if ClipFirstFrameNumber is not None and ClipLastFrameNumber is not None:
        fn = ClipLastFrameNumber - ClipFirstFrameNumber
        if FrameRate == '23.98' and DropFrame == 'False':
            fps = 24 / 1.001
        elif FrameRate == '24' and DropFrame == 'False':
            fps = 24
        elif FrameRate == '25' and DropFrame == 'False':
            fps = 25
        elif FrameRate == '29.97':
            fps = 30 / 1.001
        elif FrameRate == '30' and DropFrame == 'False':
            fps = 30
        elif FrameRate == '59.94':
            fps = 60 / 1.001
        Duration = fn / fps
    elif srcDuration is not None:
        Duration = srcDuration
    else:
        None
The function has three cases, and the first case has several conditions. In that case I first subtract the ClipFirstFrameNumber column from the ClipLastFrameNumber column and save the result in the fn variable, then pick fps from the frame rate and drop-frame flag and divide. In the second case I fall back to srcDuration, which is also a column in the data frame. A sample of the relevant columns is below:
ClipLastFrameNumber  ClipFirstFrameNumber  tgtDropFrame  tgtFrameRate
NaN                  NaN                   True          29.97
NaN                  NaN                   True          29.97
NaN                  NaN                   True          29.97
34354.0              28892.0               True          29.97
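To make the intended logic concrete, here is a minimal sketch of it written as a function that takes one row at a time. It rests on several assumptions that are not confirmed above: that FrameRate and DropFrame correspond to the tgtFrameRate and tgtDropFrame columns, that missing values are NaN (so pd.notna is used rather than `is not None`), that tgtFrameRate is numeric or a numeric string, and that srcDuration is already in seconds. The FPS_RULES table, the seg_duration name, and the srcDuration sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Nominal frame rate -> (exact fps, rule applies only when NOT drop-frame).
# Hypothetical lookup table mirroring the if/elif chain in the question.
FPS_RULES = {
    23.98: (24 / 1.001, True),
    24.0: (24.0, True),
    25.0: (25.0, True),
    29.97: (30 / 1.001, False),
    30.0: (30.0, True),
    59.94: (60 / 1.001, False),
}

def seg_duration(row):
    """Return the segment duration in seconds for a single row, or NaN."""
    first = row["ClipFirstFrameNumber"]
    last = row["ClipLastFrameNumber"]
    if pd.notna(first) and pd.notna(last):
        # Normalise the rate to a float key and the flag to a bool
        # (assumes the flag is stored as True/False or 'True'/'False').
        rate = round(float(row["tgtFrameRate"]), 2)
        is_drop = str(row["tgtDropFrame"]).strip().lower() == "true"
        fps, non_drop_only = FPS_RULES.get(rate, (None, False))
        if fps is None or (non_drop_only and is_drop):
            return np.nan  # no rule matches this rate/flag combination
        return (last - first) / fps
    if pd.notna(row["srcDuration"]):
        return row["srcDuration"]
    return np.nan

# Tiny sample; the srcDuration values are invented for the example.
sample = pd.DataFrame({
    "ClipLastFrameNumber": [np.nan, 34354.0],
    "ClipFirstFrameNumber": [np.nan, 28892.0],
    "tgtDropFrame": [True, True],
    "tgtFrameRate": [29.97, 29.97],
    "srcDuration": [120.5, 300.0],
})

# axis=1 passes each row to the function as a Series.
sample["seg_duration"] = sample.apply(seg_duration, axis=1)
```

With this sample, the first row falls back to srcDuration and the second row is computed from the frame numbers at 30 / 1.001 fps.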
When I apply this function as below
df['seg_duration']=df.apply(seq_duration)
I get this error: NameError: ("name 'ClipFirstFrameNumber' is not defined", 'occurred at index Unnamed: 0')
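The "occurred at index Unnamed: 0" part of the message is consistent with how DataFrame.apply behaves by default: with axis=0 the function is called once per column, and 'Unnamed: 0' is the first column. The small sketch below, on a made-up two-column frame named demo, shows what the function receives in each mode. It is only an illustration of the apply call, not a diagnosis of everything wrong with the function.

```python
import pandas as pd

# Hypothetical toy frame, unrelated to the real data.
demo = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Default (axis=0): the function gets each COLUMN as a Series,
# so s.name is the column label.
seen_by_column = demo.apply(lambda s: s.name)

# axis=1: the function gets each ROW as a Series,
# so s.name is the row's index label and s["a"] reads a cell.
seen_by_row = demo.apply(lambda s: s.name, axis=1)
row_sums = demo.apply(lambda s: s["a"] + s["b"], axis=1)
```

In either mode, column values have to be read from the Series passed in (for example s["a"]); a bare name such as ClipFirstFrameNumber is looked up as an ordinary Python variable, which matches the NameError.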
Is this the right way to write a function for pandas? If not, how do I apply this function to the data frame to create the new column df['seg_duration']? Thanks in advance.