I have two sets, Set A and Set B (https://paste.debian.net/343292/), that contain data from several previous executions. Set B contains the total execution times, and Set A contains the values of several variables at the time of each execution.

I have this code that computes a multivariate linear regression [1], but the predicted time comes out negative. I don't know whether the problem is in the Python code, in the two sets, or in the way I calculate the new time. Where is the problem?

[1] Python code

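import numpy as np
from numpy import linalg  # assumption: the original does not show which linalg is imported

# set_a: one row of variables per execution; set_b: the matching total times
xx = np.array(set_a)
yy = np.array(set_b)

# append a column of ones so that the last coefficient is the intercept
A = np.column_stack((xx, np.ones(len(xx))))

# least-squares fit; element [0] of the result holds the coefficients
coeffs = linalg.lstsq(A, yy, rcond=None)[0]

# coefficients follow the column order of set_a
wqueueacapacity = coeffs[0]
wbytesread = coeffs[1]
wmaps = coeffs[2]
wcpu_info = coeffs[3]
wmem_info = coeffs[4]

# I predict the time by multiplying the weights with new params (not shown here) and adding the intercept, coeffs[5].
time = (wbytesread * params[1]) + (wqueueacapacity * params[0]) + (wmaps * params[2]) + (wcpu_info * params[3]) + (wmem_info * params[4]) + coeffs[5]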
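
To make the prediction step concrete, here is a minimal, self-contained sketch on synthetic data. It is not the real Set A and Set B: the numbers, `true_weights`, `true_intercept`, and the helper `predict_time` are all made up for illustration. It shows that, with the column of ones appended last, the last coefficient is the intercept, and that an ordinary least-squares model is unconstrained, so it can return a negative value when one weight is negative and the new params lie outside the range the model was fitted on.

```python
import numpy as np

# Synthetic stand-ins for Set A (5 variables per run) and Set B (times).
# These values are invented for illustration only.
rng = np.random.default_rng(0)
set_a = rng.uniform(1.0, 10.0, size=(50, 5))
true_weights = np.array([2.0, -3.0, 1.5, 0.5, 1.0])
true_intercept = 4.0
set_b = set_a @ true_weights + true_intercept

# The column of ones is appended LAST, so the intercept is the last coefficient.
A = np.column_stack((set_a, np.ones(len(set_a))))
coeffs = np.linalg.lstsq(A, set_b, rcond=None)[0]
weights, intercept = coeffs[:-1], coeffs[-1]


def predict_time(params):
    """Predict the time for one new run; params uses the same column order as set_a."""
    return float(np.dot(weights, params) + intercept)


# Params inside the fitted range (1..10 for every variable).
inside = predict_time(np.array([5.0, 5.0, 5.0, 5.0, 5.0]))

# Second variable far outside the fitted range, and its weight is negative.
outside = predict_time(np.array([1.0, 20.0, 1.0, 1.0, 1.0]))

print("weights:", weights)
print("intercept:", intercept)
print("inside range:", inside)
print("outside range:", outside)
```

In this sketch the second prediction is negative even though the fit itself is fine, which suggests checking the signs of the fitted weights and whether the new params fall within the range covered by Set A.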
xeon123
  • this question is similar and may help you out http://stackoverflow.com/q/33964913/832621 – Saullo G. P. Castro Dec 11 '15 at 13:28
  • Yes, it is similar, but I still don't quite understand the solution, and I am also having difficulty seeing how to apply the solution in the link to my problem. – xeon123 Dec 11 '15 at 14:16
  • In my case, is the way I calculate the time correct? Which coefficient represents the error: the first or the last? Why am I getting a negative time? – xeon123 Dec 12 '15 at 15:17

0 Answers