Search for values in all text files and multiply them by a fixed value (in Python)

Question

Hi friends.

I have a lot of files containing text. I want to look only at specific lines, find the values at specific positions in those lines, and multiply them by a fixed value (or one entered as input).

Example text:

1,0,0,0,1,0,0
15.000,15.000,135.000,15.000
7
3,0,0,0,2,0,0
'holep_str',50.000,-15.000,20.000,20.000,0.000
3
3,0,0,100,3,-8,0
58.400,-6.600,'14',4.000,0.000
4
3,0,0,0,3,-8,0
50.000,-15.000,50.000,-15.000
7
3,0,0,0,4,0,0
'holep_str',100.000,-15.000,14.000,14.000,0.000
3
3,0,0,100,5,-8,0
108.400,-6.600,'14',4.000,0.000

And I want to identify and modify only lines with "holep_str" text:

'holep_str',50.000,-15.000,20.000,20.000,0.000
'holep_str',100.000,-15.000,14.000,14.000,0.000

Each line that begins with the string "holep_str" contains two numbers of interest, the 3rd and 4th numeric values:

20.000 20.000
14.000 14.000

They can be identified as:
1. the number after the 3rd comma on a line beginning with "holep_str"
2. the number after the 4th comma on a line beginning with "holep_str"

RegEx does not seem to help. Python probably can, but I'm pressed for time and cannot get any further with the language...

Can somebody explain how to write this relatively simple code, which finds all lines with the search string (= "holep_str") and multiplies the values after the 3rd and 4th commas by FIXVALUE (or an input value, for example "2")?

The code should walk through all files with a given extension (chosen by input, for example txt) in the directory where it is executed, find the values on the relevant lines, multiply them, and write them back...

So with FIXVALUE = 2 the lines become:
'holep_str',50.000,-15.000,40.000,40.000,0.000
'holep_str',100.000,-15.000,28.000,28.000,0.000

And the whole text then looks like:
1,0,0,0,1,0,0
15.000,15.000,135.000,15.000
7
3,0,0,0,2,0,0
'holep_str',50.000,-15.000,40.000,40.000,0.000
3
3,0,0,100,3,-8,0
58.400,-6.600,'14',4.000,0.000
4
3,0,0,0,3,-8,0
50.000,-15.000,50.000,-15.000
7
3,0,0,0,4,0,0
'holep_str',100.000,-15.000,28.000,28.000,0.000
3
3,0,0,100,5,-8,0
108.400,-6.600,'14',4.000,0.000

Thank You.

Why not check each line, see if it `.startswith(your_string)` and then `.split(',')` the line? – ryugie, Mar 13 '17 at 16:01
Thank you for the tip, I will investigate tomorrow. But what then with the split parts? How do I select the first number, convert it to a real number, and multiply it by FIXVALUE? I don't need the full code, only the right direction. I'm new to Python. – Peter Maly, Mar 13 '17 at 20:57
How do I put the code here? I'm working on showing the first good results and then writing them back to the file... @takendarkk – Peter Maly, Mar 14 '17 at 14:47

score 0 · Answer 1 · answered Mar 13 '17 at 21:39

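file_path = 'input.txt'  # placeholder: path of the file to read

with open(file_path) as f:
    lines = f.readlines()

for line in lines:
    # only lines that begin with the quoted search string are of interest
    if line.startswith("'holep_str'"):
        split_line = line.split(',')
        num1 = float(split_line[3])
        num2 = float(split_line[4])
        print(num1, num2)
        # do stuff with num1 and num2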

Once you .split() the line with ',' as the argument, you get a list. You can then pick the values you want by index, which are 3 and 4 in your case. Each value is converted to float as it is read.
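To put the multiplied values back into the line, one option is to overwrite the list items and join the list again. The following is only a minimal sketch: the function name scale_fields and its parameters are made up for illustration, and it assumes the values should keep three decimals, as in the sample data.

```python
def scale_fields(line, factor, indices=(3, 4), decimals=3):
    """Return `line` with the comma-separated fields at `indices` multiplied by `factor`."""
    # Set the line ending aside so it survives even if the last field is rewritten.
    body = line.rstrip("\r\n")
    ending = line[len(body):]
    parts = body.split(",")
    for i in indices:
        # Convert to float, multiply, and format back to a fixed number of decimals.
        parts[i] = f"{float(parts[i]) * factor:.{decimals}f}"
    return ",".join(parts) + ending


sample = "'holep_str',50.000,-15.000,20.000,20.000,0.000\n"
print(scale_fields(sample, 2), end="")
```

Formatting with a fixed number of decimals is a choice here; plain str() would turn 40.000 into 40.0.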

ryugie


Yes yes yes. I knew that something like this must be possible :-) It grew in my head during the night :-) It is a pity that one cannot learn the whole language first and then create something creative... Now let's investigate how to write the values back to the same position in the line after multiplying them. Thank you ryugie (if you want, you can point me to the right function again :-) ) – Peter Maly Mar 14 '17 at 06:19
I'm stuck somewhere... :-( I cannot find a way to write the multiplied values back to their original positions and save the file... :-( Any idea? – Peter Maly Mar 14 '17 at 10:19
I have created a script that simply (and flawlessly) reads all lines beginning with the specified "start_string" in all files with the entered (input) extension in the entered (input) directory, and multiplies the values by the entered (input) number. I now understand a little bit how it works, but which function do I use to replace split_line[x] with the new multiplied numbers? – Peter Maly Mar 14 '17 at 13:22
split_line = line.split(',') old_num1 = float(split_line[3]) old_num2 = float(split_line[4]) new_num1 = old_num1 * multiply_value new_num2 = old_num2 * multiply_value split_line[3] = str(new_num1) split_line[4] = str(new_num2) line = ','.join(split_line) – Peter Maly Mar 14 '17 at 14:17
cannot get the text into separate lines... :-( – Peter Maly Mar 14 '17 at 14:17
now I cannot get how to write the line back to original position... :-( here I'm stuck again... :-( – Peter Maly Mar 14 '17 at 14:23
@PeterMaly - Check out this answer or the answers above it: http://stackoverflow.com/a/1811866/5971855 – ryugie Mar 14 '17 at 22:53
Looks good :-) I will investigate today. I also see the option of reading all lines from the original file and writing them to a new file, and if a condition is met (e.g. the line begins with 'holep_str'), changing the line (multiplying split_line[3] and [4]) before writing it to the new file, then renaming the original to .OLD and the new file to original.ext – Peter Maly Mar 15 '17 at 07:03
READY GO!!! It is done!!! I used my method of reading all lines and writing them to a temp file... It works flawlessly on the sample files. I will post the code here asap. – Peter Maly Mar 15 '17 at 10:26
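The approach described in these comments (copy every line to a new file, change the matching ones on the way, then rename) could be sketched roughly as follows for a single file. The function name rewrite_with_backup and the ".new" scratch suffix are hypothetical, and the sketch assumes every matching line has at least six comma-separated fields, as in the sample data.

```python
import os


def rewrite_with_backup(path, prefix, factor):
    """Scale fields 3 and 4 on lines starting with `prefix`; keep the original as path + '.old'."""
    new_path = path + ".new"  # hypothetical scratch name next to the original
    with open(path, "r") as src, open(new_path, "w") as dst:
        for line in src:
            if line.startswith(prefix):
                parts = line.split(",")
                for i in (3, 4):
                    # three decimals, matching the sample data
                    parts[i] = "{:.3f}".format(float(parts[i]) * factor)
                line = ",".join(parts)
            dst.write(line)
    # Both files are closed at this point; Windows generally refuses to rename open files.
    os.replace(path, path + ".old")  # overwrites an existing backup, if any
    os.replace(new_path, path)
```

Calling rewrite_with_backup("part.txt", "'holep_str'", 2) would then leave the scaled text in part.txt and the untouched original in part.txt.old (the file name is only an example).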

score 0 · Accepted Answer · answered Mar 15 '17 at 14:55

So, the final solution, the whole program (version: python-3.6.0-amd64):

# import external functions / extensions ...
import os  
import glob

# functions definition section
def fnc_walk_through_files(path, file_extension):
   for (dirpath, dirnames, filenames) in os.walk(path):
      for filename in filenames:
         if filename.endswith(file_extension): 
            # join with dirpath (not path) so files in subdirectories resolve correctly
            yield os.path.join(dirpath, filename)

# some variables for counting
line_count = 0

# Feed data to program by entering them on keyboard
print ("Enter work path (e.g. d:\\test) :")
workPath = input( "> " )
print ("File extension to perform Search-Replace on [spf] :")
fileExt = input( "> " )
print ("Enter multiplier value :")
multiply_value = input( "> " )
print ("Text to search for :")
textToSearch = input( "> " ) 

# build the path mask for deleting all ".old" files in the work path
delPath = os.path.join(workPath, "*.old")
# delete old ".old" files to allow creating backups
for files_to_delete in glob.glob(delPath, recursive=False):
    os.remove(files_to_delete)

# do some needed operations...
print("\r") #enter new line
multiply_value = float(multiply_value) # convert multiplier to float
textToSearch_mod = "\'" + textToSearch # append apostrophe to begin of searched text
textToSearch_mod = str(textToSearch_mod) # convert variable to string for later use
# print information line of what will be searched for
print ("This is what will be searched for, to identify right line: ", textToSearch_mod)
print("\r") #enter new line

# walk through all files with specified extension <-- CALLED FUNCTION !!!
for fname in fnc_walk_through_files(workPath, fileExt):
   print("\r") # enter new line
   # print filename of processed file
   print("          Filename processed:", fname )
   # and process every file and print out the numbers
   # to be multiplied, located at the 3rd and 4th positions
   with open(fname, 'r') as f: # opens fname file for reading
       temp_file = open('tempfile','w') # open (create) tempfile for writing
       lines = f.readlines() # read lines from f:
       line_count = 0 # reset counter
       # loop through all lines
       for line in lines:
           # line counter increment
           line_count = line_count + 1
           # if the line starts with the defined string, it will be processed
           if line.startswith(textToSearch_mod):
               # line will be divided into parts delimited by ","
               split_line = line.split(',')
               # transfer 3rd part to variable 1 and make it float number
               old_num1 = float(split_line[3])
               # transfer 4th part to variable 2 and make it float number
               old_num2 = float(split_line[4])
               # multiply both variables
               new_num1 = old_num1 * multiply_value
               new_num2 = old_num2 * multiply_value
               # change old values to new multiplied values as strings
               # (formatted with three decimals, matching the input data)
               split_line[3] = "{:.3f}".format(new_num1)
               split_line[4] = "{:.3f}".format(new_num2)
               # join the line back with the same delimiter "," as used for dividing
               line = ','.join(split_line)
               # print an information line showing where the searched string occurred
               print ("Changed from old:", old_num1, old_num2, "to new:", new_num1, new_num2, "at line:", line_count)
               # write changed line with multiplied numbers to temporary file
               temp_file.write(line)
           else:
               # write all other unchanged lines to temporary file
               temp_file.write(line)
   # create new name for backup file with adding ".old" to the end of filename
   new_name = fname + '.old'
   # rename original file to new backup name
   os.rename(fname,new_name)
   # close temporary file to enable future operation (in this case rename)
   temp_file.close()
   # rename temporary file to original filename
   os.rename('tempfile',fname)

So, two days after asking, with the help of good people, hard study of the language :-D (indentation was my nightmare), and some snippets of code from this site, I have created something that works... :-) I hope it helps other people with a similar question...

At the beginning the idea was clear, but I had no knowledge of the language...

Now everything can be done; imagination is the only limit :-)

I miss GOTO in Python :'( ... I love spaghetti, not spaghetti code, but sometimes it would be good to have some label/goto jumps... (though not in this case...)

It is so quick that, with more test files, it reached a state where it reports that the temp file is open and cannot be used again. So does the close command not work 100%, or is it a bug in Python? Or in Windows? I inserted a 0.05 s delay there and now it works OK... – Peter Maly, Mar 16 '17 at 10:27
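The cause of that error is not established in this thread, so the following is only a sketch of one thing that might be worth trying instead of a delay: give every processed file its own uniquely named temp file rather than reusing the fixed name 'tempfile', and close it before swapping it in. The helper name replace_file_contents is hypothetical; tempfile.mkstemp, os.fdopen and os.replace are standard-library calls.

```python
import os
import tempfile


def replace_file_contents(path, new_lines):
    """Write `new_lines` to a uniquely named temp file, then swap it in for `path`."""
    folder = os.path.dirname(os.path.abspath(path))
    # mkstemp creates a new, uniquely named file, so no two files share a temp name.
    fd, tmp_path = tempfile.mkstemp(dir=folder, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.writelines(new_lines)
        # The temp file is closed here, before it replaces the original.
        os.replace(tmp_path, path)
    except Exception:
        # Clean up the temp file if writing or replacing failed.
        os.remove(tmp_path)
        raise
```

The temp file is created in the same folder as the target so that the final replace stays on one drive. A backup copy with the ".old" suffix would still have to be made separately before calling it.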