Smart way to implement lookup in high-perf FORTRAN code

Question

I'm writing a simulation in FORTRAN 77, in which I need to get a parameter value from some experimental data. The data comes from an internet database, so I have downloaded it in advance, but there is no simple mathematical model that can be used to provide values on a continous scale - I only have discrete data points. However, I will need to know this parameter for any value on the x axis, and not only the discrete ones I have from the database.

To simplify, you could say that I know the value of f(x) for all integer values of x, and need a way to find f(x) for any real x value (never outside the smallest or largest x I have knowledge of).

My idea was to take the data and make a linear interpolation, to be able to fetch a parameter value; in pseudo-code:

double xd = largest_data_x_lower_than(x)

double slope = (f(xd+dx)-f(xd))/dx // dx is the distance between two x values
double xtra = x-xd

double fofx = f(xd)+slope*xtra

To implement this, I need some kind of lookup for the data points. I could make the lookup for xd easy by getting values from the database for all integer x, so that xd = int(x) and dx = 1, but I still have no idea how to implement the lookup for f(xd).

What would be a good way to implement this?

The value will be fetched something like 10^7 to 10^9 times during one simulation run, so performance is critical. In other words, reading from IO each time I need a value for f(xd) is not an option.

I currently have the data points in a text file with one pair of (tab-delimited) x,f(x) on each line, so bonus points for a solution that also provides a smooth way of getting the data from there into whatever shape it needs to be.

score 2 · Accepted Answer · edited May 23 '17 at 12:13

2

You say that that you have the values for all integers. Do you have pairs i, f(i) for all integers i from M to N? Then read the values f(i) into an array y dimensioned M:N. Unless the number of values is HUGE. For real values between M and N it is easy to index into the array and interpolate between the nearest pair of values.

And why use FORTRAN 77? Fortran 90/95/2003 have been with us for some years now...

EDIT: Answering question in the comment, re how to read the data values only once, in FORTRAN 77, without having to pass them as an argument in a long chain of calls. Technique 1: on program startup, read them into the array, which is in a named common block. Technique 2: the first time the function that returns f(x) is called, read the values into a local variable that is also on a SAVE statement. Use a logical which is SAVEd to designate whether or not the function is on its first call or not. Generally I'd prefer technique 2 as being more "local", but its not thread safe. If you are the doing simulation in parallel, the first technique could be done in a startup phase, before the program goes multi-threaded.

Here is an example of the use of SAVE: fortran SAVE statement. (In Fortran 95 notation ... convert to FORTRAN 77). Put the read of the data into the array in the IF block.

edited May 23 '17 at 12:13

Community

1
1

answered Apr 06 '13 at 01:06

M. S. B.

28,968
2
46
73

FORTRAN 77 is imposed on me by the group I'm working in. It's a temporary position, so I've decided the gain of forcing the group to move on is smaller than just learning :P How do I read the values into memory *only once*, without having to pass the arrays around all the time? This lookup is quite a way down the stack - if possible I'd like to avoid having to add an extra couple of function arguments in four or five places... – Tomas Aschan Apr 06 '13 at 01:21
It is in parallel, using MPI - in other words, I run the program in a number of separate processes (but I don't do any thread-branching within the processes). I think the common block will be the best solution, to ensure that I can read the file in a safe way - I'll try it out tomorrow and see if I can get it to work. – Tomas Aschan Apr 07 '13 at 21:19
OK; I now have decided on an approach that will work, thread-safely, in my parallel environment. I'll read in the file and create an interpolation with fine-enough x resolution in the master thread, and then broadcast that thread to all other threads. The vector will reside in a common-block, to avoid passing it around as an argument. If I run into problems with this, it's probably not related to the choice of approach, but rather to me being a clumsy programmer, so I'll see this as resolved now. Thanks for the help! – Tomas Aschan Apr 08 '13 at 16:42

score 0 · Answer 2 · answered Apr 05 '13 at 22:59

0

you probably want a way to interpolate or fit your data, but you need to be more specific about, say, dimensionality of your data, how your data behave, what fashion are you accessing your data (for example, maybe your next request is always near the last one), how the grid is made (evenly spaced, random or some other fashion), and where you're needing the data to be able to know which method is the best for you.

However, if the existing data set is very dense and near linear then you can certainly do a linear interpolation.

answered Apr 05 '13 at 22:59

Xiaolei Zhu

507
4
10

Are you confusing "linear interpolation" with "linear fit"? The data is nowhere near linear, so a linear *fit* is out of the question, but if I could find a good way to look up the two data points closest to a given x value, a linear interpolation between them is just fine. (I can make sure that the data is dense enough in x-space for that.) Both `x` and `f(x)` are scalars, so a matrix with all x-values and all function values would be 2-by-N, where N is the number of data points. – Tomas Aschan Apr 05 '13 at 23:05

score 0 · Answer 3 · answered Apr 07 '13 at 18:41

Using your database (file), you could create an array fvals with fvals(ii) being the function f(xmin + (ii-1) * dx). The mapping between x-value xx and your array index is ii = floor((xx - xmin) / dx) + 1. Once you know ii, you can use the points around it for interpolation: Either doing linear interpolation using ii and ii+1 or some higher order polynomial interpolation. For latter, you could use the corresponding polint routine from Numerical Recipes. See page 103.

Smart way to implement lookup in high-perf FORTRAN code

3 Answers3