How can I do these calculations from a file?

Question

I have a file containing two columns (x and y) and need to apply this equation to them:

de = sqrt((xi-xj)^2-(yi-yj)^2)

The result should be a single column:

row1 = sqrt((x1-x2)^2-(y1-y2)^2)

row2 = sqrt((x1-x3)^2-(y1-y3)^2)

Apply the equation between point 1 (x1, y1) and every other point. When those are finished, continue with point 2:

row 6 = sqrt((x2-x3)^2-(y2-y3)^2)

row 7 = sqrt((x2-x4)^2-(y2-y4)^2)

Apply it between point 2 (x2, y2) and every later point, and so on until all x and y values have been processed, then store the results in a file.

I tried storing the numbers in two arrays and then doing the calculations, but the data is too large and arrays would be the wrong choice. How can I do this in Python, reading the i-th and j-th values directly from the file?

My attempt (sorry if it's bad):

import math
with open('columnss.txt', 'r', encoding='utf-8') as f:
      for line in f: 
           [x, y] = (int(n) for n in line.split())
           d = math.sqrt(((x[0] - y[0])**2) + ((x[1] - y[1])** 2)) 
           with open('result.txt', 'w', encoding='utf-8') as f1:
                  f1.write( str(d) + '\n')

I got:

ValueError: invalid literal for int() with base 10: '-9.2'

I did the calculations in Excel, but I am trying to do them in Python too. Should I put each column in a separate file to make reading the numbers easier, or can I do this with a single file?


Barmar · Accepted Answer · 2021-02-01T23:51:49.740

You need to loop through the input file twice. The second loop can skip all the lines that are before the line from the first loop.

If you could load the file contents into a list or array, you could do this more easily by iterating over indexes rather than skipping lines.
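If the data does fit in memory, the list-based approach could look roughly like the sketch below. It assumes the points are already parsed into a list of (x, y) tuples; `pairwise_values` and the sample `points` are hypothetical names and data, not taken from the question. `itertools.combinations` yields the pairs in the order the question describes (point 1 with 2, 3, ..., then point 2 with 3, 4, ...).

```python
import cmath
from itertools import combinations


def pairwise_values(points):
    """Return sqrt((xi-xj)^2 - (yi-yj)^2) for every pair with i < j."""
    return [
        cmath.sqrt((x1 - x2) ** 2 - (y1 - y2) ** 2)
        for (x1, y1), (x2, y2) in combinations(points, 2)
    ]


# Made-up sample data standing in for the parsed file contents.
points = [(0.0, 0.0), (5.0, 3.0), (1.0, 2.0)]
results = pairwise_values(points)
for value in results:
    print(value)
```

For n points this produces n*(n-1)/2 values, so only the input has to be held in memory; the results can be written out one at a time.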

Also, you should only open the output file once. You're overwriting it every time through the loop.

import cmath

with open('columnss.txt', 'r', encoding='utf-8') as f1, open('columnss.txt', 'r', encoding='utf-8') as f2, open('result.txt', 'w', encoding='utf-8') as outfile:
    for i1, line in enumerate(f1):
        x1, y1 = (float(n) for n in line.split())
        f2.seek(0)
        for i2, line in enumerate(f2):
            if i1 < i2:
                x2, y2 = (float(n) for n in line.split())
                print(cmath.sqrt((x1-x2)**2-(y1-y2)**2), file=outfile)
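A note on why `cmath` is used in the code above: the formula subtracts the squared differences, so the value under the root can be negative. `math.sqrt` raises `ValueError` for negative input, while `cmath.sqrt` returns a complex number. A minimal illustration with made-up coordinates:

```python
import cmath
import math

# Made-up pair of points: (1, 5) and (2, 8).
radicand = (1.0 - 2.0) ** 2 - (5.0 - 8.0) ** 2  # 1 - 9 = -8

try:
    math.sqrt(radicand)
    math_error = None
except ValueError as exc:
    # The exact message text depends on the Python version.
    math_error = str(exc)

root = cmath.sqrt(radicand)  # purely imaginary result for a negative input

print("math.sqrt failed with:", math_error)
print("cmath.sqrt returned:", root)
```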

I had tried an increment variable, but it had no effect on the code. I learned a lot from you today, thanks. — user1, Feb 02 '21 at 01:06

sudhish · Answer 2 · 2021-02-01T22:17:24.440

Whenever a problem looks like something that could be done in an Excel sheet and I want a Python way of doing it, I use pandas.

I am assuming pandas is OK for you to use too.

Here is the code that reads 'columns.txt' and writes 'Output.csv':

import pandas as pd
import cmath
df = pd.read_csv('columns.txt', sep=r"\s+") # read columns.txt into a dataframe, using whitespace as delimiter
df.dropna(inplace=True,axis=1)                 # multiple whitespaces create NA columns. Better to use csv file
df = df.astype(float)                          # specify the columns as type float
print("-"*20 + "Input" + "-"*20)
print(df)                                      # show the parsed input
print("-"*50)

for index, row in df.iterrows():
    origin=row                              # specify current row as origin

    '''
    Adding  equation column
    Here we are using a lambda function (same as de used in the question)
    and creating a new column called equation
    '''
    df["equation from row {}".format(index)]=df.apply(lambda row_lambda: cmath.sqrt((origin.x-row_lambda.x)**2 - (origin.y-row_lambda.y)**2), axis=1)

print("-"*20 + "Output" + "-"*20)
print(df)
print("-"*50)

# Save this output as csv file (even excel is possible)
df.to_csv('Output.csv')


The output will look like:

    --------------------Input--------------------
             x         y
    0 -99.9580 -28.84930
    1 -71.5378 -26.77280
    2 -91.6913 -40.90390
    3 -69.0989 -12.95010
    4 -79.6443  -9.20575
    5 -92.1975 -20.02760
    6 -99.7732 -14.26070
    7 -80.3767 -18.16040
    --------------------------------------------------
    --------------------Output--------------------
             x         y      distance from row 0      distance from row 1  \
    0 -99.9580 -28.84930                       0j  (28.344239552155912+0j)   
    1 -71.5378 -26.77280  (28.344239552155912+0j)                       0j   
    2 -91.6913 -40.90390       8.773542743384796j  (14.369257985017867+0j)   
    3 -69.0989 -12.95010  (26.448052710360358+0j)      13.605837059144871j   
    4 -79.6443  -9.20575   (5.174683670283624+0j)      15.584797189970107j   
    5 -92.1975 -20.02760       4.194881481043308j  (19.527556965734348+0j)   
    6 -99.7732 -14.26070      14.587429482948666j   (25.31175945583396+0j)   
    7 -80.3767 -18.16040   (16.40654523292457+0j)  (1.9881447256173002+0j)   
    
           distance from row 2      distance from row 3      distance from row 4  \
    0       8.773542743384796j  (26.448052710360358+0j)   (5.174683670283624+0j)   
    1  (14.369257985017867+0j)      13.605837059144871j     -15.584797189970107j   
    2                       0j      16.462028935705348j      29.319660714655278j   
    3      16.462028935705348j                       0j   (9.858260710566546-0j)   
    4      29.319660714655278j   (9.858260710566546+0j)                       0j   
    5       20.87016203219323j  (21.987594586720945+0j)   (6.361634445447185+0j)   
    6      25.387851398454337j  (30.646288651809048+0j)  (19.483841913429192+0j)   
    7       19.72933397482034j  (10.002077121778257+0j)       8.924648276682952j   
    
           distance from row 5      distance from row 6      distance from row 7  
    0       4.194881481043308j      14.587429482948666j   (16.40654523292457+0j)  
    1  (19.527556965734348-0j)   (25.31175945583396-0j)  (1.9881447256173002-0j)  
    2      -20.87016203219323j     -25.387851398454337j       19.72933397482034j  
    3  (21.987594586720945+0j)  (30.646288651809048+0j)  (10.002077121778257+0j)  
    4   (6.361634445447185+0j)  (19.483841913429192+0j)       8.924648276682952j  
    5                       0j   (4.912646423263124-0j)  (11.672398074089152+0j)  
    6   (4.912646423263124+0j)                       0j  (19.000435578165046+0j)  
    7  (11.672398074089152+0j)  (19.000435578165046-0j)                       0j  
    --------------------------------------------------
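As a possible alternative to the row-by-row `apply`, the same N×N table could be computed with NumPy broadcasting. This is a sketch and not part of the original answer: `pairwise_matrix` is a hypothetical helper, and casting to complex before `np.sqrt` is an assumption made so that negative values give imaginary results the way `cmath.sqrt` does. Like the DataFrame above, it holds N×N values in memory, so it only suits inputs of moderate size.

```python
import numpy as np


def pairwise_matrix(x, y):
    """Return an N x N complex array of sqrt((xi-xj)^2 - (yi-yj)^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x[:, None] - x[None, :]  # all pairwise x differences
    dy = y[:, None] - y[None, :]  # all pairwise y differences
    radicand = dx ** 2 - dy ** 2
    # Cast to complex so negative radicands yield imaginary values, not NaN.
    return np.sqrt(radicand.astype(complex))


# First three rows of the sample input shown above.
x = [-99.9580, -71.5378, -91.6913]
y = [-28.84930, -26.77280, -40.90390]
table = pairwise_matrix(x, y)
print(table)
```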

To learn more about pandas:
[https://pandas.pydata.org/docs/][1]

Stack Overflow itself is an excellent resource for learning the many ways of using pandas.


  [1]: https://pandas.pydata.org/docs/



Here the column names are defined as 'x' and 'y' in the file's header.
If the file has no header, you can assign the column names with:
df.columns=['x','y'] 
after reading the csv file (or text file). In that case pass header=None to read_csv so the first data row is not consumed as the header.

If the file already has a header and you want to keep those names, use them in the lambda formula instead.
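For a file without a header row, one way to get the 'x' and 'y' names is to pass them to `read_csv` directly, which also avoids treating the first data row as a header. The sketch below uses an in-memory sample (a made-up stand-in for 'columns.txt') so that it runs on its own; `header=None` and `names=` are standard `read_csv` parameters.

```python
import io

import pandas as pd

# Made-up stand-in for a headerless, whitespace-separated 'columns.txt'.
raw = io.StringIO("-99.9580 -28.84930\n-71.5378   -26.77280\n")

# header=None: the file has no header row; names= supplies the column labels.
df = pd.read_csv(raw, sep=r"\s+", header=None, names=["x", "y"])

print(df)
print(df.iloc[0].x)  # attribute access works once the column is named 'x'
```

A missing 'x' column is one plausible cause of an `AttributeError` on `origin.x`, since attribute access only works for labels that actually exist.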

Please see: 
https://stackoverflow.com/questions/14365542/import-csv-file-as-a-pandas-dataframe

Hope this helps

This isn't a distance formula. It's subtracting the squares of the differences, not adding them. — Barmar, Feb 01 '21 at 21:34
My mistake - I think cmath will work on top of this very easily df["distance from row {}".format(index)]=df.apply(lambda row_lambda: cmath.sqrt((origin.x-row_lambda.x)**2 - (origin.y-row_lambda.y)**2), axis=1) — sudhish, Feb 01 '21 at 21:58
What is `origin`? The question says he wants to calculate the formula using every pair of rows, not just from the first row. — Barmar, Feb 01 '21 at 22:00
I think I have answered it twice- I'm going to delete this version. — sudhish, Feb 01 '21 at 22:02
Maybe working through the header will help. I have found pandas to be very efficient and intuitive once you start using it. — sudhish, Feb 01 '21 at 22:10
I will try pandas on another problem. Thanks for your effort. I still have a simple problem with the counter that should increment to get the results for the other values. Where should I put the increment variable to get the next value from the file if I'm using the code in my post? Can you help? The logic is like the distance formula, but I need to subtract instead of add. — user1, Feb 01 '21 at 22:35

sudhish · Answer 3 · 2021-02-01T22:04:09.827

Whenever a problem looks like something that could be done in an Excel sheet and I want a Python way of doing it, I use pandas.

I am assuming pandas is OK for you to use too.

Here is the code that reads 'columns.txt' and writes 'Output.csv'. It finds the distance of each row from all the others and adds a new column for each row:

import pandas as pd
import cmath
df = pd.read_csv('columns.txt', sep=r"\s+") # read columns.txt into a dataframe, using whitespace as delimiter
df.dropna(inplace=True,axis=1)                 # multiple whitespaces create NA columns. Better to use csv file
df = df.astype(float)                          # specify the columns as type float
print("-"*20 + "Input" + "-"*20)
print(df)                                      # show the parsed input
print("-"*50)

for index, row in df.iterrows():
    origin=row                              # specify current row as origin

    '''
    Adding distance column
    Here we are using a lambda function (same as de used in the question)
    and creating a new column called distance
    '''
    df["distance from row {}".format(index)]=df.apply(lambda row_lambda: cmath.sqrt((origin.x-row_lambda.x)**2 - (origin.y-row_lambda.y)**2), axis=1)

print("-"*20 + "Output" + "-"*20)
print(df)
print("-"*50)

# Save this output as csv file (even excel is possible)
df.to_csv('Output.csv')


The output will look like:

--------------------Input--------------------
         x         y
0 -99.9580 -28.84930
1 -71.5378 -26.77280
2 -91.6913 -40.90390
3 -69.0989 -12.95010
4 -79.6443  -9.20575
5 -92.1975 -20.02760
6 -99.7732 -14.26070
7 -80.3767 -18.16040
--------------------------------------------------
--------------------Output--------------------
         x         y      distance from row 0      distance from row 1  \
0 -99.9580 -28.84930                       0j  (28.344239552155912+0j)   
1 -71.5378 -26.77280  (28.344239552155912+0j)                       0j   
2 -91.6913 -40.90390       8.773542743384796j  (14.369257985017867+0j)   
3 -69.0989 -12.95010  (26.448052710360358+0j)      13.605837059144871j   
4 -79.6443  -9.20575   (5.174683670283624+0j)      15.584797189970107j   
5 -92.1975 -20.02760       4.194881481043308j  (19.527556965734348+0j)   
6 -99.7732 -14.26070      14.587429482948666j   (25.31175945583396+0j)   
7 -80.3767 -18.16040   (16.40654523292457+0j)  (1.9881447256173002+0j)   

       distance from row 2      distance from row 3      distance from row 4  \
0       8.773542743384796j  (26.448052710360358+0j)   (5.174683670283624+0j)   
1  (14.369257985017867+0j)      13.605837059144871j     -15.584797189970107j   
2                       0j      16.462028935705348j      29.319660714655278j   
3      16.462028935705348j                       0j   (9.858260710566546-0j)   
4      29.319660714655278j   (9.858260710566546+0j)                       0j   
5       20.87016203219323j  (21.987594586720945+0j)   (6.361634445447185+0j)   
6      25.387851398454337j  (30.646288651809048+0j)  (19.483841913429192+0j)   
7       19.72933397482034j  (10.002077121778257+0j)       8.924648276682952j   

       distance from row 5      distance from row 6      distance from row 7  
0       4.194881481043308j      14.587429482948666j   (16.40654523292457+0j)  
1  (19.527556965734348-0j)   (25.31175945583396-0j)  (1.9881447256173002-0j)  
2      -20.87016203219323j     -25.387851398454337j       19.72933397482034j  
3  (21.987594586720945+0j)  (30.646288651809048+0j)  (10.002077121778257+0j)  
4   (6.361634445447185+0j)  (19.483841913429192+0j)       8.924648276682952j  
5                       0j   (4.912646423263124-0j)  (11.672398074089152+0j)  
6   (4.912646423263124+0j)                       0j  (19.000435578165046+0j)  
7  (11.672398074089152+0j)  (19.000435578165046-0j)                       0j  
--------------------------------------------------

To learn more about pandas:
[https://pandas.pydata.org/docs/][1]

Stack Overflow itself is an excellent resource for learning the many ways of using pandas.


  [1]: https://pandas.pydata.org/docs/

Thanks for your effort. I tried the code but got: line 2, in df = pd.read_csv('columnss.txt', delimiter=" ") pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 1792, saw 3 — user1, Feb 01 '21 at 21:44
You can save columns.txt as a csv file and read it directly. I did not see the error on using the example input. — sudhish, Feb 01 '21 at 21:46
https://stackoverflow.com/questions/18039057/python-pandas-error-tokenizing-data — sudhish, Feb 01 '21 at 21:47
I think it could be because your data file has lines that are empty — sudhish, Feb 01 '21 at 21:48
Can you paste the header (maybe 20 lines) of your file in the question? — sudhish, Feb 01 '21 at 21:49
I edited the post to include the first 8 lines. Do you need more, or is that enough to try? — user1, Feb 01 '21 at 21:53
When the data has inconsistent whitespace, pandas may read the columns differently — sudhish, Feb 01 '21 at 21:53
https://stackoverflow.com/questions/16022094/using-pandas-to-read-text-file-with-leading-whitespace-gives-a-nan-column — sudhish, Feb 01 '21 at 21:53
I tried your line and got AttributeError: 'Series' object has no attribute 'x' — user1, Feb 01 '21 at 21:54

Belhadjer Samir · Answer 4 · 2021-02-01T20:55:19.783

Try this:

import math
with open('result.txt', 'w', encoding='utf-8') as f1:
    with open('columnss.txt', 'r', encoding='utf-8') as f:
           while True :
              line=f.readline()
              [x, y] = (int(float(n)) for n in line.split())
              if ((x[0] - y[0])**2) + ((x[1] - y[1])** 2)> 0:
                d = math.sqrt(((x[0] - y[0])**2) + ((x[1] - y[1])** 2)) 
                f1.write(line +':' str(d) + '\n')
              if not line :
                 break 
f.close()
f.close()

edited Feb 01 '21 at 20:55

answered Feb 01 '21 at 20:44


still got the same error [x, y] = (int(n) for n in line.split()) ValueError: invalid literal for int() with base 10: '-9.0 – user1 Feb 01 '21 at 20:46
Thanks for your help. I checked and got ValueError: math domain error – user1 Feb 01 '21 at 20:51
The problem is with your values: in some cases you are trying to take the square root of a negative value. Try my update; I removed the negative values from the calculation – Belhadjer Samir Feb 01 '21 at 20:56
I upvoted your answer for the effort you made to help me, thanks. But I still have a problem with the sequence of points to be calculated, as I got wrong numbers in the result – user1 Feb 01 '21 at 21:22
