First: this probably looks like it has already been answered. Similar problems have been described and answered, but (I think) this one is substantively different enough to merit asking here (sorry if I'm wrong). That is why I'm writing a fairly detailed explanation below. Sorry to be long-winded, but I'd rather be too detailed.
I'm trying to process large numbers of .txt files: go through each one, find every instance of a targeted word, and then print that word and the 10 words on either side of it into a .csv file for analysis (to get a feel for the context the words are used in).
I want each individual word to land in its own cell for later analysis. So, in the .csv handling portion, I have it record single indices one at a time leading up to the keyword, and then single indices ascending away from it, 10 in each direction. It works like a charm unless the word I'm targeting is within 10 indices of the start or the end of the document.
If it is, it raises "IndexError: list index out of range".
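A minimal way to reproduce it, using a made-up five-word list in place of up_file_split_raw (the list and the chosen index are assumptions for illustration only, not my real data):

```python
# Hypothetical stand-in for up_file_split_raw: a very short "document".
words = "the quick brown fox jumps".split()
assoc_wrd_range = 10
index = 2  # pretend the keyword is "brown"

start = max(0, index - assoc_wrd_range)
before = words[start:index]  # the slice is clamped, so it holds only 2 words

try:
    third = before[2]  # same pattern as up_file_split_raw[start:index][2]
    error = None
except IndexError as exc:
    error = str(exc)

print(len(before), error)
```

The slice itself never fails; it just comes back shorter than 10 items, and the fixed [0] through [9] lookups are what fall off the end.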
I've seen helpful explanations on here about building an index list and dealing with indexing that overruns (Python Loop: List Index Out of Range), but my problem is that I need (well, I'd like / hope I'm able) to keep the program requesting each index and have it return ' ' when it hits the beginning or end of the file, instead of running into a wall.
* For brevity's sake, here is the chunk of code that sets up the indexing, followed by the chunk that does the index querying; they're not actually stacked like this in the real code. The parentheticals here may be off by a space. I don't think that's pertinent, but I thought I'd approximate in case I am, as usual, wrong. *
    for index in range(len(up_file_split_raw)):
        if keyword.match(up_file_split_raw[index]):
            start = max(0, index-assoc_wrd_range)
            finish = min(len(up_file_split_raw), index+assoc_wrd_range+1)
            assocd_wrd_list = string.join(up_file_split_raw[start:finish])
Break in Code
    row_vals_2 = {
        'Assoc_1':(up_file_split_raw[start:index][0]),
        'Assoc_2':(up_file_split_raw[start:index][1]),
        'Assoc_3':(up_file_split_raw[start:index][2]),
        'Assoc_4':(up_file_split_raw[start:index][3]),
        'Assoc_5':(up_file_split_raw[start:index][4]),
        'Assoc_6':(up_file_split_raw[start:index][5]),
        'Assoc_7':(up_file_split_raw[start:index][6]),
        'Assoc_8':(up_file_split_raw[start:index][7]),
        'Assoc_9':(up_file_split_raw[start:index][8]),
        'Assoc_10':(up_file_split_raw[start:index][9]),
        'KeyWord':(up_file_split_raw[index]),
        'Assoc_11':(up_file_split_raw[index+1:finish][0]),
        'Assoc_12':(up_file_split_raw[index+1:finish][1]),
        'Assoc_13':(up_file_split_raw[index+1:finish][2]),
        'Assoc_14':(up_file_split_raw[index+1:finish][3]),
        'Assoc_15':(up_file_split_raw[index+1:finish][4]),
        'Assoc_16':(up_file_split_raw[index+1:finish][5]),
        'Assoc_17':(up_file_split_raw[index+1:finish][6]),
        'Assoc_18':(up_file_split_raw[index+1:finish][7]),
        'Assoc_19':(up_file_split_raw[index+1:finish][8]),
        'Assoc_20':(up_file_split_raw[index+1:finish][9]),
    }
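To make the behaviour I'm after concrete, here is a rough sketch of one possible approach: pad the two slices out to a fixed width with ' ' before looking anything up. The helper names padded_window and build_row, and the five-word sample list, are hypothetical and only for illustration; this is not how my real script is laid out, and there may well be a better way.

```python
def padded_window(words, index, width=10, pad=' '):
    """Return (before, after), each a list of exactly `width` items."""
    before = words[max(0, index - width):index]
    after = words[index + 1:index + 1 + width]
    # Pad on the outer edges so every word keeps its distance from the keyword.
    before = [pad] * (width - len(before)) + before
    after = after + [pad] * (width - len(after))
    return before, after


def build_row(words, index, width=10, pad=' '):
    """Build a dict shaped like row_vals_2, with ' ' for missing neighbours."""
    before, after = padded_window(words, index, width, pad)
    row = {'KeyWord': words[index]}
    for i, word in enumerate(before + after, start=1):
        row['Assoc_%d' % i] = word
    return row


# Hypothetical stand-in for up_file_split_raw, with the keyword at index 2.
words = "the quick brown fox jumps".split()
row = build_row(words, 2)
print(row['Assoc_9'], row['Assoc_10'], row['KeyWord'], row['Assoc_11'])
```

With the padding on the outside, Assoc_10 would always be the word immediately before the keyword and Assoc_11 the word immediately after it, which is what the fixed lookups give when the keyword sits in the middle of a long file.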