
I'm trying to write a simple app that computes anyone's score on a test from the following formula:

score = (grade - average) / variance

For example, if your grade out of 20 is 18 and the class average is 15, this formula shows how your grade compares to the others. My code opens an Excel file located on my PC, reads the point column into a list, computes the average and variance, and applies the formula. The grades in the Excel file are just for testing. I've tried these two versions (I'm not very experienced with classes and was trying to make some use of them). This is the first one:

class taraz:
  def __init__(self,file_name,file_dir,your_point):
      self.file_name=file_name
      self.file_dir=file_dir
      self.your_point=your_point
  def sum_ave():
      f=pandas.read_excel(r (file_dir))
      point_list=f['point'].tolist()
      sum1=sum(point_list)
      ave1=sum1/len(point_list)
  def variance():
      for i in point_list:
        var1=sqrt(((i-ave1)**2)/len(point_list))
  def taraz1():
      taraz1=(your_point-ave1)/var1
      print(taraz1)
  print(taraz1)

This is the second one:

def taraz(file_name,file_dir,your_point):
  def sum_ave():
      f=pandas.read_excel(r (file_dir))
      point_list=f['point'].tolist()
      sum1=sum(point_list)
      ave1=sum1/len(point_list)
  def variance():
      for i in point_list:
          var1=sqrt(((i-ave1)**2)/len(point_list))
  def taraz1():
      taraz1=(your_point-ave1)/var1
      print(taraz1) 

From the first code I just got an output like this: <__main__.taraz object at 0x02528130>, and from the second one I don't get any output at all. I'd be glad to hear your tips. Thanks anyway.

LMKR
infinite
  • You are failing to understand the scopes of each variable. – Adirio Sep 28 '20 at 10:47
  • For example, in the first example, what is `file_dir` inside the `sum_ave` method? How do you expect it to understand that variable? – Adirio Sep 28 '20 at 10:50
  • Additionally, why would you use pandas to read from excel, and turn that dataframe into a list and then define your own mean and variance functions when you could use `df.mean()` and `df.var()`? – Adirio Sep 28 '20 at 11:22
  • Thanks a lot. I wasn't aware that I can't use a variable outside a function unless it is global. But I'm not familiar with 'df.mean' and 'df.var'; would you please explain them further? – infinite Sep 28 '20 at 11:44
  • `df` is the usual variable name for dataframes, which are the type of object that pandas works with. My answer below shows how to use these two methods directly instead of re-implementing them. – Adirio Sep 28 '20 at 11:56

2 Answers

First, understand variable scope. A variable assigned inside a function or method is accessible only from within it, unless it is declared global.

In your code, look at the variance method inside your class:

def variance():
    for i in point_list:
        var1=sqrt(((i-ave1)**2)/len(point_list))

How would the variance method know about the point_list variable? It is neither defined inside the method nor declared as a global or class variable.
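The scope rule can be seen in a minimal sketch. The function names mirror the question, but the bodies and the sample points are made up for illustration:

```python
def sum_ave(points):
    # ave1 is a local name: it exists only while sum_ave is running.
    ave1 = sum(points) / len(points)
    return ave1


def variance_broken():
    # ave1 was never defined in this function or at module level,
    # so looking it up raises NameError.
    return ave1


# Returning the value is what makes it available to the caller.
average = sum_ave([15, 16, 20])

try:
    variance_broken()
except NameError as error:
    scope_error = str(error)
    print(scope_error)
```

Passing values in as arguments and returning results (or storing them on self, as in the class further down) avoids the need for globals.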

Second, methods of a class take the instance as their first parameter, conventionally named self, unless they are decorated as a classmethod or staticmethod. Note that self is a naming convention, not a keyword.

Third, calling a class creates an object, but calling a function does not. That is why your first version shows an object representation, while your second version shows nothing.

So, the code after adding self will look like this:
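As a small illustration of how self carries state between methods, here is a sketch with a hypothetical Counter class (not part of the question's code):

```python
class Counter:
    """Hypothetical example class, used only to illustrate self."""

    def __init__(self, start):
        # Values assigned to self are stored on the instance.
        self.value = start

    def increment(self):
        # Other methods reach the stored value through self.
        self.value += 1
        return self.value


counter = Counter(10)
counter.increment()          # Python passes counter as self automatically
Counter.increment(counter)   # equivalent call with self passed explicitly
print(counter.value)
```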

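from math import sqrt
from os import path

import pandas


class Taraz:
    def __init__(self, file_name, file_dir, your_point):
        self.file_name = file_name
        self.file_dir = file_dir
        self.your_point = your_point
        self.point_list = None
        self.ave1 = None
        self.var1 = None

    def sum_ave(self):
        # Read the 'point' column and store the list and its average on the instance.
        f = pandas.read_excel(path.join(self.file_dir, self.file_name))
        self.point_list = f['point'].tolist()
        sum1 = sum(self.point_list)
        self.ave1 = sum1 / len(self.point_list)

    def variance(self):
        # Requires sum_ave() to have run first.
        if self.point_list is not None and self.ave1 is not None:
            # Sum the squared deviations over all points before taking the root.
            # Because of the sqrt, this is a standard deviation despite the name.
            squared = sum((i - self.ave1) ** 2 for i in self.point_list)
            self.var1 = sqrt(squared / len(self.point_list))

    def taraz1(self):
        # Requires sum_ave() and variance() to have run first.
        taraz1 = (self.your_point - self.ave1) / self.var1
        print(taraz1)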

Edit:

>>> def func():
...     pass
...
>>> class cla:
...     pass
...
>>> func()
>>> 
>>> cla()
<__main__.cla object at 0x0000019A55944550>
>>> func
<function func at 0x0000019A552D2EA0>
>>> cla
<class '__main__.cla'>

() is used to call a method or function. Here func is a function and cla is a class. When you call a class, it returns an instance of the class, so you see <__main__.cla object at 0x0000019A55944550>. When you call a function, you get whatever the function returns. Since my function does nothing, it returned None, which the interpreter does not display.
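To get an actual score out of a class like the one above, create an instance first and then call its methods in the order they depend on each other. The sketch below uses a hypothetical in-memory variant, TarazSketch, that takes a list of points instead of reading an Excel file, so it runs on its own. The sample points are made up.

```python
from math import sqrt


class TarazSketch:
    """Hypothetical in-memory variant of the class above (no Excel file needed)."""

    def __init__(self, point_list, your_point):
        self.point_list = point_list
        self.your_point = your_point
        self.ave1 = None
        self.var1 = None

    def sum_ave(self):
        self.ave1 = sum(self.point_list) / len(self.point_list)

    def variance(self):
        # Population standard deviation, following the sqrt in the question's code.
        squared = sum((p - self.ave1) ** 2 for p in self.point_list)
        self.var1 = sqrt(squared / len(self.point_list))

    def taraz1(self):
        return (self.your_point - self.ave1) / self.var1


t = TarazSketch([12, 14, 16, 18], your_point=18)  # calling the class creates an instance
t.sum_ave()    # must run before variance()
t.variance()   # must run before taraz1()
result = t.taraz1()
print(result)
```

Calling the method on the class itself, as in TarazSketch.taraz1(), fails because no instance is supplied for self.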

LMKR
  • `os.path.join(self.file_dir, self.file_name)` would be nicer. – Adirio Sep 28 '20 at 11:21
  • I edited to add styling (class names start with an uppercase letter, spaces on both sides of equals signs, ...) but didn't want to change anything in the actual code without your approval. Edited now to use `os.path.join` instead of the separator. – Adirio Sep 28 '20 at 11:26
  • Thanks a lot @LMKR. I fully understood your first two objections, but the third one is still unclear to me. Please give me a further explanation if possible. – infinite Sep 28 '20 at 11:48
  • I think he means that in your second example you are defining functions inside of a function and not a class. This is not actually an error per se, functions can be declared anywhere, but for this use case it makes no sense to declare a function inside other as it will only be able to be called from inside the function. – Adirio Sep 28 '20 at 12:37
  • @LMKR They are called instances. A call to a class will return an instance of the class type (unless you mess up with meta classes and some more advanced features that could change this behavior). Object is a very generic term, even functions themselves are objects. – Adirio Sep 28 '20 at 12:58
  • @LMKR sorry but I'm still not sure how to use my code(which you edited). how should I exactly call it to get an output? `Taraz.taraz1()` only gets self as input and `taraz1.Taraz()` is not defined. if anyone could help plz mention thanks – infinite Sep 29 '20 at 07:29

There is no reason to use pandas to read the Excel file, then convert the data into a list and re-implement basic vector operations such as mean and variance.

from os import path

import pandas as pd


class Taraz:
    def __init__(self, filepath):
        scores = pd.read_excel(filepath)['point']
        self.mean = scores.mean()
        self.var = scores.var()

    def score(self, score):
        return (score - self.mean) / self.var


if __name__ == '__main__':
    taraz = Taraz(path.join('path', 'to', 'the', 'file.xlsx'))
    print(taraz.score(16))

Output:

-0.012571428571428395
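For readers unfamiliar with the mean() and var() methods, here is a minimal sketch on a made-up in-memory Series instead of an Excel column. One detail worth knowing is that pandas' var() computes the sample variance by default (ddof=1, dividing by n - 1); passing ddof=0 gives the population variance.

```python
import statistics

import pandas as pd

# Made-up points, standing in for the 'point' column of the Excel file.
points = pd.Series([15, 16, 15, 20, 12])

mean = points.mean()           # arithmetic mean
var = points.var()             # sample variance (ddof=1, divides by n - 1)
pop_var = points.var(ddof=0)   # population variance (divides by n)

# The standard library computes the same quantities on a plain list.
values = points.tolist()
print(mean, statistics.mean(values))
print(var, statistics.variance(values))
print(pop_var, statistics.pvariance(values))
```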

In your examples you have several errors that I would like to comment on.

  1. Variable scope is important. If you assign a variable inside a function, it exists only inside that function. Accessing it outside will raise a NameError.
  2. A method's first argument (which should be called self, except in some special methods) is the instance itself. We can assign values to it that are stored in the instance and retrieve them later. For example, in the constructor (__init__) of the code above, we assign something to self.mean, and that value is stored in the instance so that we can later use it in our score method.
  3. OOP (Object-Oriented Programming) is a very well-established paradigm, but forcing the use of a class for something that doesn't really represent a type seems a bit unnecessary. This could easily be achieved in a single function:
from os import path

import pandas as pd


def taraz(filepath):
    scores = pd.read_excel(filepath)
    mean = scores['point'].mean()
    var = scores['point'].var()
    scores['scores'] = (scores['point'] - mean) / var
    return scores


if __name__ == '__main__':
    print(taraz(path.join('path', 'to', 'the', 'file.xlsx')))

Output:

     name  point    scores
0    tghi     15 -0.163429
1    nghi     16 -0.012571
2   asghr     15 -0.163429
3    sbhn     20  0.590857
4    tghi     12 -0.616000
5    nghi     20  0.590857
6   asghr     17  0.138286
7    sbhn     18  0.289143
8    tghi     17  0.138286
9    nghi     16 -0.012571
10  asghr     15 -0.163429
11            12 -0.616000

As you can see, pandas dataframes implement vector operations, so (scores['point'] - mean) / var is evaluated as a vector of integers minus a float, divided by a float, and the result of that operation is a vector of floats, which we store in the 'scores' column. This way we compute the scores for every row.
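The same element-wise behaviour can be tried without an Excel file. This is a small sketch on a made-up three-row dataframe; the names and points are invented for illustration:

```python
import pandas as pd

# Made-up data with the same column names as the Excel file.
df = pd.DataFrame({"name": ["a", "b", "c"], "point": [12, 15, 18]})

mean = df["point"].mean()   # a single float
var = df["point"].var()     # a single float (sample variance)

# Series - float and Series / float are applied to every row at once.
df["scores"] = (df["point"] - mean) / var
print(df)
```

No explicit loop is needed: the subtraction and division are broadcast over the whole column.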

Adirio