df:

gender   order
F         1
F         1
M         1
F         1
M         1
F         1

Aim: to check whether the mean `order` differs significantly between F and M, i.e. whether there is a significant difference between females and males for order 1. (I feel something is wrong, but I cannot figure out what at this stage.) My code returns `Ttest_indResult(statistic=nan, pvalue=nan)`; I based the code below on this ref:

from scipy.stats import ttest_ind

# Split the rows by gender
cat1 = df[df['gender']=='F']
cat2 = df[df['gender']=='M']
# Independent two-sample t-test on the `order` column
t_tst_rsult = ttest_ind(cat1['order'], cat2['order'])
print(t_tst_rsult)
SaNa
  • Have you looked at `cat1['order']` and `cat2['order']`? Are they indeed floating point Series? – DYZ Jul 29 '21 at 07:09
  • the 2 groups (`M` and `F`) have identical values, so no difference will be detected by the test, hence the nan pvalues – Simon Jul 29 '21 at 07:09
  • As Simon said +1, change the values in your order column and re-run the same code and you'll see proper results – sophocles Jul 29 '21 at 07:10
  • All the records in order have value=1. Do you think t-test is a proper solution to see the difference? – SaNa Jul 29 '21 at 07:11
  • well because the values are identical, there is literally no difference to be seen – Simon Jul 29 '21 at 07:12
  • if everything in `order` is 1, there's no standard deviation, or variation, there's no purpose in doing any test whatsoever – StupidWolf Jul 29 '21 at 07:20
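
To make the point in the comments concrete, here is a minimal sketch of the pooled-variance (Student) t statistic, which is the form `ttest_ind` uses by default. It is written with the standard library only, and the helper name `pooled_t_statistic` and the second, varied sample are made up for illustration. With every `order` equal to 1, both the mean difference and the standard error are zero, so the statistic is 0/0 and comes out as nan.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's two-sample t statistic with pooled variance (illustrative helper)."""
    n_a, n_b = len(a), len(b)
    # Pooled sample variance of the two groups
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    diff = mean(a) - mean(b)
    if std_err == 0:
        # No variation in either group: 0/0 is undefined (nan);
        # a nonzero difference over zero spread is treated as +/- infinity.
        return float("nan") if diff == 0 else math.copysign(math.inf, diff)
    return diff / std_err


# `order` values from the question: four F rows and two M rows, all equal to 1
female = [1, 1, 1, 1]
male = [1, 1]
t_const = pooled_t_statistic(female, male)
print(t_const)  # nan

# Hypothetical values with some variation, only to show a defined statistic
t_varied = pooled_t_statistic([1, 2, 3, 4], [2, 5])
print(t_varied)
```

The nan is therefore not a bug in the code from the question; the data simply contain no variation for the test to work with.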

1 Answer

In case someone has the same issue: a better test for this sample is a one-sample test of proportions.
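
One possible reading of this suggestion, sketched below and not necessarily the exact procedure meant here: treat the share of F among the order-1 rows as the quantity of interest and compare it with a reference proportion. The null value `p0 = 0.5` (equal shares of F and M) and the helper name `one_sample_proportion_ztest` are assumptions made for illustration. The sketch uses the normal approximation, which is rough for a sample of only six rows, so an exact binomial test may be preferable on data this small.

```python
import math
from statistics import NormalDist


def one_sample_proportion_ztest(successes, n, p0):
    """Two-sided z-test of H0: true proportion == p0 (normal approximation)."""
    p_hat = successes / n
    # Standard error under the null hypothesis
    std_err = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# gender column from the question (all rows have order == 1)
genders = ["F", "F", "M", "F", "M", "F"]
n_female = genders.count("F")

# p0 = 0.5 is an assumed null value, not something stated in the question
z_stat, p_val = one_sample_proportion_ztest(n_female, len(genders), p0=0.5)
print(f"z = {z_stat:.3f}, p = {p_val:.3f}")
```

Unlike the t-test on `order`, this works on the counts of F and M, so it stays well defined even though every `order` value is 1.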

SaNa