I have a function that plots multiple graphs. The x-axis of any given graph can be linear or logarithmic. However, I do not want to pass a list as a parameter indicating which graphs are logarithmic. I would prefer that to be decided transparently by analysing the data.

x_linear = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
x_log = [1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8, 1e-9, 1e-10, 1e-11, 1e-12, 1e-13, 1e-14, 1e-15]

Something like `islog(x_linear)` returning False and `islog(x_log)` returning True.

The x values are not always exactly linear or exactly logarithmic. They could be:

x1 = [10, 20, 50, 100, 200, 500, 1000, 1200, 1500, 1800, 1900, 2000, 2100, 3000]
x2 = [1, 2, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377]
x3 = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 20]

They are somewhat exponential, but not enough to need a logarithmic x-axis.

juanmah
  • You can read 3 successive values (a,b,c) from list and then check if `b-a == c-b` which will be true only for linear list. – brokenfoot Sep 27 '19 at 20:22
  • Not always the linear x-axis is successive, or at equal distances. – juanmah Sep 27 '19 at 20:27
  • could you please update the question with these specific details? – user7440787 Sep 27 '19 at 20:28
  • If the X axis is not at regular intervals what defines logarithmic for you? If the intervals aren't equal either of these sequences could be generated by a linear or logarithmic function. – Simon Notley Sep 27 '19 at 20:32
  • This is really a question about data analysis. You can compute the exponential of `x_log`, then [fit a line](https://en.wikipedia.org/wiki/Linear_regression) to the data, and finally compute a [goodness of fit measure](https://en.wikipedia.org/wiki/Goodness_of_fit) and judge from that if it is logarithmic enough. For linear the same but skip the exponentiation. If the only two cases are linear and logarithmic and nothing else, you could test for linearity. – conditionalMethod Sep 27 '19 at 20:41
  • @conditionalMethod Yes, it's something like that. The theory is a good start. But is there an existing function to evaluate that without reinventing the wheel? Some function that says how exponential the series is. – juanmah Sep 27 '19 at 20:47
  • In [this answer](https://stackoverflow.com/a/1517401/12043501) they explain how to do it with numpy. The output `r_value` is the goodness of fit in this case. You want it to be rather close to 1 to say that the line fits well. How close? Your call. – conditionalMethod Sep 27 '19 at 20:52
  • Why not just pass another parameter indicating whether it should be treated as logarithmic? – Barmar Sep 27 '19 at 20:56
  • @conditionalMethod Thanks, r_value is what I wanted. – juanmah Sep 27 '19 at 21:14
  • @Barmar More code, more headaches, especially if you have to touch what is already done. Some old code would have to be revised and refactored. Why do it manually if it can be automated? Simpler is not always better. – juanmah Sep 27 '19 at 21:17
  • Firstly, curve fitting to try to figure out what function generated your own axis is craziness and just asking for trouble. Secondly, you can't fit a curve to a one-dimensional series. The OP stated that the intervals aren't necessarily equal, so we can't assume the steps in the other dimension are equal and therefore we can't fit a curve; see my earlier comment. – Simon Notley Sep 27 '19 at 21:24
  • @SimonN Fitting a one-dimensional series is not difficult; you only need another, linear one-dimensional series. See my own answer. – juanmah Sep 27 '19 at 21:32
  • Same question was already asked here: https://stackoverflow.com/questions/36795949/python-testing-if-my-data-follows-a-lognormal-distribution – PythonNoob Sep 28 '19 at 13:37
  • @SimonN You just didn't understand what they said. They meant that the linear case wasn't an arithmetic progression, as the test in the first comment would imply. – conditionalMethod Sep 29 '19 at 13:32

1 Answer

The `rvalue` returned by the `linregress` function tells how linear the series is. Thanks to @conditionalMethod for the comment.

from scipy import stats

x_linear = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
x_log = [1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8, 1e-9, 1e-10, 1e-11, 1e-12, 1e-13, 1e-14, 1e-15]
x1 = [10, 20, 50, 100, 200, 500, 1000, 1200, 1500, 1800, 1900, 2000, 2100, 3000]
x2 = [1, 2, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377]
x3 = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 20]

stats.linregress(x_linear, range(0, len(x_linear))) # rvalue=1.0
stats.linregress(x_log, range(0, len(x_log))) # rvalue=-0.49016926355349816
stats.linregress(x1, range(0, len(x1))) # rvalue=0.9722589459436218
stats.linregress(x2, range(0, len(x2))) # rvalue=0.8358503295705382
stats.linregress(x3, range(0, len(x3))) # rvalue=0.9325110133355075

Values close to 1 are more linear (close to -1 for a descending series). The threshold can easily be chosen.
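
One way the threshold idea could be wrapped into the `islog` helper asked for in the question is sketched below. It is a dependency-free illustration, not part of the original answer: `pearson_r` is a hypothetical helper that computes the same correlation coefficient `linregress` reports as `rvalue`, and the cutoff of 0.8 is an assumption, picked only because it separates `x_log` (|r| ≈ 0.49) from the other sample series (|r| ≥ 0.83).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient (what linregress reports as rvalue)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def islog(xs, threshold=0.8):
    """Guess that a log axis is needed when xs is poorly linear against its index.

    threshold is an assumed cutoff on |r|; tune it for your own data.
    """
    r = pearson_r(xs, range(len(xs)))
    # abs() so that descending series are treated like ascending ones
    return abs(r) < threshold


x_linear = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
x_log = [10.0 ** -k for k in range(2, 16)]  # 1e-2 down to 1e-15

print(islog(x_linear))  # False
print(islog(x_log))     # True
```

With this cutoff, `x1`, `x2` and `x3` from the question would also be classified as linear, since their `rvalue` magnitudes listed above are all greater than 0.8.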

It can also be used to measure how logarithmic the series is. (The reference series has to match the direction of the data, ascending or descending.)

import numpy as np

stats.linregress(x_log, np.logspace(len(x_log), 1, num=len(x_log))) # rvalue=0.9999999999999999
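
An alternative sketch that avoids building a reference `np.logspace` series with a matching ratio and direction: take `log10` of the values and correlate the result with the index. The names `log_linearity` and `_pearson_r` are hypothetical and not part of the original answer. The approach assumes strictly positive values, so a series containing 0, such as `x_linear`, is rejected.

```python
import math


def _pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def log_linearity(xs):
    """Correlation between log10(xs) and the index; |r| near 1 means geometric spacing."""
    if len(xs) < 2:
        raise ValueError("need at least 2 points")
    if any(x <= 0 for x in xs):
        raise ValueError("log10 needs strictly positive values")
    logs = [math.log10(x) for x in xs]
    return _pearson_r(logs, range(len(logs)))


x_log = [10.0 ** -k for k in range(2, 16)]  # 1e-2 down to 1e-15
print(abs(log_linearity(x_log)))  # very close to 1
```

A magnitude near 1 suggests roughly geometric spacing. A Fibonacci-like series such as `x2` also scores high on this measure, so on its own it would probably not keep `x2` on a linear axis as the question prefers; combining it with a threshold on the linear fit may be needed.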
juanmah
  • You stated in an earlier comment "Not always the linear x-axis is successive, or at equal distances.", yet this answer assumes exactly that. – Simon Notley Sep 27 '19 at 21:36