1

I have many .txt files of this type:

name1.fits 0 0 4088.9 0. 1. 0. -0.909983 0.01386 0.91 0.01386 -0.286976 0.00379 2.979 0.03971 0. 0.
name2.fits 0 0 4088.9 0. 1. 0. -0.84702 0.01239 0.847 0.01239 -0.250671 0.00261 3.174 0.04749 0. 0.
#name3.fits 0 0 4088.9 0. 1. 0. -0.494718 0.01168 0.4947 0.01168 -0.185677 0.0042 2.503 0.04365 0. 0.
#name4.fits 0 1 4088.9 0. 1. 0. -0.751382 0.01342 0.7514 0.01342 -0.202141 0.00267 3.492 0.07224 0. 0.
name4.fits 0 1 4088.961 0.01147 1.000169 0. -0.813628 0.01035 0.8135 0.01035 -0.217434 0.00196 3.515 0.04045 0. 0.

I want to divide the values of one of these columns by the values of a column from another file of the same type. Here is what I have so far:

with open('4026.txt','r') as out1, open('4089.txt', 'r') as out2, \
     open('4116.txt', 'r') as out3, open('4121.txt', 'r') as out4, \
     open('4542.txt', 'r') as out5, open('4553.txt', 'r') as out6:

    for data1 in out1.readlines():
        col1 = data1.strip().split()
        x = col1[9]

    for data2 in out2.readlines():
        col2 = data2.strip().split()
        y = col2[9]

    f = float(y) / float(x)
    print f

However, I always end up with a single value for x. For example, if the first data set above is 4089.txt and the second, 4026.txt, is:

name1.fits 0 0 4026.2 0. 1. 0. -0.617924 0.01749 0.6179 0.01749 -0.19384 0.00383 2.995 0.09205 0. 0.
name2.fits 0 0 4026.2 0. 1. 0. -0.644496 0.01218 0.6445 0.01218 -0.183373 0.00291 3.302 0.05261 0. 0.
#name3.fits 0 0 4026.2 0. 1. 0. -0.507311 0.01557 0.5073 0.01557 -0.176148 0.00472 2.706 0.07341 0. 0.
#name4.fits 0 1 4026.2 0. 1. 0. -0.523856 0.01086 0.5239 0.01086 -0.173477 0.00279 2.837 0.05016 0. 0.
name4.fits 0 1 4026.229 0.0144 1.014936 0. -0.619708 0.00868 0.6106 0.00855 -0.185527 0.00189 3.138 0.04441 0. 0.

and I divide the field at index 9 (the tenth column) of one file by that of the other, then, taking only the first row of each, I should get 0.91/0.6179 = 1.47, but I obtain 0.958241758242.

martineau
EternalGenin

2 Answers

1

Each of your for-loops overwrites its variable (x or y) on every iteration, so once the loops finish, x and y hold only the values from the last line of each file, and the single division after the loops uses just those. You need to perform the division once per row to get the correct results.
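To see the effect in isolation, here is a minimal sketch in which two short in-memory lists stand in for the files. The names `rows_a` and `rows_b` are illustrative only, and the values are the first two index-9 entries from the sample data in the question.

```python
# Hypothetical stand-ins for the index-9 field of each file.
rows_a = ["0.6179", "0.6445"]   # first two rows of 4026.txt
rows_b = ["0.91", "0.847"]      # first two rows of 4089.txt

# Pattern from the question: each loop overwrites its variable,
# so only the value from the last row survives.
for value in rows_a:
    x = value
for value in rows_b:
    y = value
last_only = float(y) / float(x)

# Dividing inside a single loop yields one ratio per row instead.
per_row = [float(b) / float(a) for a, b in zip(rows_a, rows_b)]

print(last_only)
print(per_row)
```

The first print shows a single number built from the last rows only, while the second shows one ratio for every row.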

An easier approach is to collect all the values in two lists, e.g. x = [0.0149, 0.01218, ...] and y = [...]

Then divide the two lists element by element, using numpy or a for-loop over the lists. Remember that both lists need to be the same length for this to work.

Sample code:

with open('4026.txt', 'r') as out1, open('4089.txt', 'r') as out2, \
     open('4116.txt', 'r') as out3, open('4121.txt', 'r') as out4, \
     open('4542.txt', 'r') as out5, open('4553.txt', 'r') as out6:

    # Build two lists of floats from the field at index 9
    x = []
    y = []

    for data1 in out1.readlines():
        col1 = data1.strip().split()
        x.append(float(col1[9]))  # convert here: split() returns strings

    for data2 in out2.readlines():
        col2 = data2.strip().split()
        y.append(float(col2[9]))

    for i in range(0, len(x)):

        # Make sure the denominator is not zero
        if x[i] != 0:
            print(y[i] / x[i])  # 4089.txt value divided by 4026.txt value
        else:
            print("Not possible")
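For the numpy route mentioned above, a minimal sketch might look like the following. It assumes NumPy is installed, and `column_ratio` is a hypothetical helper name, not part of any library. It accepts any iterable of lines (an open file object works), so the demo runs on the first two sample rows from the question instead of the real files.

```python
import numpy as np

def column_ratio(numer_lines, denom_lines, index=9):
    """Divide one whitespace-separated field by the same field of another source."""
    numer = np.array([float(line.split()[index]) for line in numer_lines])
    denom = np.array([float(line.split()[index]) for line in denom_lines])
    if numer.shape != denom.shape:
        raise ValueError("inputs do not have the same number of rows")
    # Rows with a zero denominator are left as NaN instead of being divided.
    result = np.full(numer.shape, np.nan)
    np.divide(numer, denom, out=result, where=denom != 0)
    return result

# First two sample rows of each file, copied from the question.
lines_4089 = [
    "name1.fits 0 0 4088.9 0. 1. 0. -0.909983 0.01386 0.91 0.01386 -0.286976 0.00379 2.979 0.03971 0. 0.",
    "name2.fits 0 0 4088.9 0. 1. 0. -0.84702 0.01239 0.847 0.01239 -0.250671 0.00261 3.174 0.04749 0. 0.",
]
lines_4026 = [
    "name1.fits 0 0 4026.2 0. 1. 0. -0.617924 0.01749 0.6179 0.01749 -0.19384 0.00383 2.995 0.09205 0. 0.",
    "name2.fits 0 0 4026.2 0. 1. 0. -0.644496 0.01218 0.6445 0.01218 -0.183373 0.00291 3.302 0.05261 0. 0.",
]

ratios = column_ratio(lines_4089, lines_4026)
print(ratios)
```

With the real files, the same helper could be called as `column_ratio(out2, out1)` inside the `with` block, since file objects yield their lines when iterated.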
Adib
  • Thanks @Adib, this is very clear. However I should note that in this case I would need `print float(y[i]) / float(x[i])`. It would be great if you could point out the way to do this with numpy. – EternalGenin May 02 '16 at 19:47
  • @JVR The simplest way to handle floats is to multiply one number by 1.0. There are other ways to handle it: http://stackoverflow.com/questions/1267869/how-can-i-force-division-to-be-floating-point-in-python . – Adib May 03 '16 at 19:29
0

You could do it like this:

with open('4026.txt','r') as out1, open('4089.txt', 'r') as out2:
    x_col9 = [data1.strip().split()[9] for data1 in out1.readlines()]
    y_col9 = [data2.strip().split()[9] for data2 in out2.readlines()]

    if len(x_col9) != len(y_col9):
        print('Error: files do not have same number of rows')
    else:
        f = [(float(y) / float(x)) for x, y in zip(x_col9, y_col9)]
        print(f)

It may be better to iterate over the file objects directly, as shown below, because that doesn't require reading the entire contents of each file into memory first and instead processes them one line at a time:

    x_col9 = [data1.strip().split()[9] for data1 in out1]
    y_col9 = [data2.strip().split()[9] for data2 in out2]
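To avoid holding even the extracted columns in memory, the two files could be walked in lockstep with a generator. This is only a sketch: `iter_ratios` is a hypothetical name, it is written for Python 3 (where `zip` is lazy; in Python 2, `itertools.izip` plays that role), and the demo uses `io.StringIO` objects in place of the real files.

```python
import io

def iter_ratios(numer_file, denom_file, index=9):
    """Yield numer/denom for the chosen field, one pair of lines at a time."""
    for numer_line, denom_line in zip(numer_file, denom_file):
        numer = float(numer_line.split()[index])
        denom = float(denom_line.split()[index])
        # Yield None when the denominator is zero rather than raising.
        yield numer / denom if denom != 0 else None

# In-memory stand-ins built from the first two sample rows of each file.
file_4089 = io.StringIO(
    "name1.fits 0 0 4088.9 0. 1. 0. -0.909983 0.01386 0.91 0.01386 -0.286976 0.00379 2.979 0.03971 0. 0.\n"
    "name2.fits 0 0 4088.9 0. 1. 0. -0.84702 0.01239 0.847 0.01239 -0.250671 0.00261 3.174 0.04749 0. 0.\n"
)
file_4026 = io.StringIO(
    "name1.fits 0 0 4026.2 0. 1. 0. -0.617924 0.01749 0.6179 0.01749 -0.19384 0.00383 2.995 0.09205 0. 0.\n"
    "name2.fits 0 0 4026.2 0. 1. 0. -0.644496 0.01218 0.6445 0.01218 -0.183373 0.00291 3.302 0.05261 0. 0.\n"
)

ratios = list(iter_ratios(file_4089, file_4026))
print(ratios)
```

One trade-off of this design: `zip` stops at the shorter input, so a mismatch in row counts is silently truncated instead of reported the way the length check in the list-based version does.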
martineau