I just joined here after reading a ton of info over the last few months as I get to grips with Python.
Anyway, I'm very new and have been researching as much as possible, but most of the answers are a bit beyond my understanding and don't seem to do exactly what I need.
From the reading I've done, I'm not sure whether I should familiarize myself with pandas or not, but I basically need to do simple formatting, conversion and reorganization of an ALE file. An ALE is a simple tab-delimited list file that contains video clip names and metadata. The column headers are on row 8 and the data starts on row 11. Here's an example:
1 Heading
2 FIELD_DELIM TABS
3 VIDEO_FORMAT 1080
4 AUDIO_FORMAT 48khz
5 FPS 23.976
6
7 Column
8 #### COLUMN HEADERS ####
9
10 Data
11 #### TAB DELIMITED DATA ####
For now, we'll assume my input files have been preformatted to strip rows 1-7, 9 and 10, so we just have a header row as row 1, and data starts on row 2.
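For reference, a preformatted file like that could be read with the standard csv module along these lines. This is only a minimal sketch: the sample content, the column names and the read_rows helper are made up for illustration, and io.StringIO stands in for a real open file.

```python
import csv
import io

# Hypothetical sample standing in for a preformatted ALE file:
# header row first, data rows after, all tab-separated.
SAMPLE = (
    "Name\tTape\tStart\tEnd\tDuration\n"
    "clip001\tA001\t01:00:00:00\t01:00:10:00\t00:00:10:00\n"
    "clip002\tA001\t01:00:10:00\t01:00:15:12\t00:00:05:12\n"
)


def read_rows(handle):
    """Return every row of a tab-delimited file as a list of strings."""
    return list(csv.reader(handle, delimiter="\t"))


rows = read_rows(io.StringIO(SAMPLE))
header, data = rows[0], rows[1:]  # row 1 is the header, the rest is data
print(header)
print(data[0])
```

With a real file, the StringIO object would be replaced by something like open(path, newline="") in Python 3.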
My first task with this program is to convert an entire column of data into a new format. I have this working correctly, but only if I hard-code the index of the column I'm after, in a data set that has no header row:
    for row in ale_file:
        row[3] = timecode_to_frames(row[3])
        print(row)
The problem is, I don't always know which column numbers the data lives in (each program outputs the metadata in a different order), but I do know the header names. Somehow I need to read the header row, find the three headers named "start", "end", and "duration", and store their column numbers in variables. Then, in the for loop above, I would be able to run my timecode_to_frames function on the columns that match those headers.
I feel this should be fairly simple along these lines (forgive me if I'm horribly off):
    for row in ale_file:
        for col in row:
            if col == 'start':
                start_col = ##column number##
Then in my existing code I could use that variable:
    for row in ale_file:
        row[start_col] = timecode_to_frames(row[start_col])
        print(row)
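One possible way to get the column numbers from the header row is to build a name-to-index lookup with enumerate. The sketch below rests on several assumptions: find_columns is a made-up helper, the header and row values are invented sample data, and the timecode_to_frames shown here is only a stand-in for the real function, assuming non-drop-frame HH:MM:SS:FF timecode at a whole-number frame rate of 24.

```python
def find_columns(header, wanted):
    """Map each wanted header name to its column index (case-insensitive)."""
    lookup = {name.strip().lower(): i for i, name in enumerate(header)}
    return {name: lookup[name.lower()] for name in wanted}


def timecode_to_frames(timecode, fps=24):
    # Hypothetical stand-in for the real conversion function.
    # Assumes non-drop-frame HH:MM:SS:FF at an integer frame rate.
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return (hours * 3600 + minutes * 60 + seconds) * fps + frames


# Invented sample data: one header row and one data row.
header = ["Name", "Start", "End", "Duration"]
rows = [["clip001", "01:00:00:00", "01:00:10:00", "00:00:10:00"]]

cols = find_columns(header, ["start", "end", "duration"])
for row in rows:
    for index in cols.values():
        row[index] = timecode_to_frames(row[index])
    print(row)  # ['clip001', 86400, 86640, 240]
```

If one of the wanted headers is missing from the file, the lookup raises a KeyError, which is probably better than silently converting the wrong column.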
Side note: in my for loop, do I need to explicitly skip row 1, since it's just a header and won't have the properly formatted data the function is expecting? Perhaps nest the for loop in a while loop, like while row != 0: or something?
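For what it's worth, a while loop is probably not needed here. A common pattern is to pull the header row off the reader with next() before the for loop starts, so the loop only ever sees data rows. A minimal sketch, with invented sample content and io.StringIO standing in for a real file:

```python
import csv
import io

# Hypothetical two-column sample: header row, then two data rows.
SAMPLE = "Name\tStart\nclip001\t01:00:00:00\nclip002\t01:00:10:00\n"

reader = csv.reader(io.StringIO(SAMPLE), delimiter="\t")
header = next(reader)  # consumes the header row

data_rows = []
for row in reader:  # iteration resumes at the first data row
    data_rows.append(row)

print(header)
print(data_rows)
```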
Any help would be greatly appreciated, thanks!