2

I get an error when I run the following code for this question. Question: Let's explore the relationship between being fed breastmilk as a child and getting a seasonal influenza vaccine from a healthcare provider. Return a tuple of the average number of influenza vaccines for those children we know received breastmilk as a child and those we know did not.

This function should return a tuple in the form (use the correct numbers):

(2.5, 0.1)

Code:

def average_influenza_doses():
    # YOUR CODE HERE
    # raise NotImplementedError()
    import pandas as pd
    import numpy as np
    df = pd.read_csv("assests/NISPUF17.csv", index_col=0)
   
    cbf_flu=df.loc[:,['CBF_01','P_NUMFLU']]
   
   
    cbf_flu1=cbf_flu[cbf_flu['CBF_01'] ==1].dropna()
    cbf_flu2=cbf_flu[cbf_flu['CBF_01'] ==2].dropna()
   
    flu1=cbf_flu1['P_NUMFLU'].values.copy()
    flu1[np.isnan(flu1)] = 0
    f1=np.sum(flu1)/len(flu1)
   
    flu2=cbf_flu2['P_NUMFLU'].values.copy()
    flu2[np.isnan(flu2)] = 0
    f2=np.sum(flu2)/len(flu2)
   
    aid =(f1,f2)
    return aid

assert len(average_influenza_doses())==2, "Return two values in a tuple, the first for yes and the second for no."

Iman Shafiei
  • Hi, welcome to the StackOverflow community. Please read this helpful post to ask better questions. Please share the errors you get. Providing the input you have and the output you want will help to answer your question. https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – Iman Shafiei Apr 26 '21 at 16:28

5 Answers

0

You misspelled the directory name.

WRONG

df = pd.read_csv("assests/NISPUF17.csv", index_col=0)

RIGHT

df = pd.read_csv("assets/NISPUF17.csv", index_col=0)

This is the only mistake I found.
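
A misspelled directory like this normally surfaces as a FileNotFoundError from pd.read_csv. As a minimal sketch (the helper name require_file is hypothetical, not part of pandas), the path can be checked before reading so the message names the full location that was tried:

```python
from pathlib import Path


def require_file(path_str):
    """Return path_str as a Path, or raise FileNotFoundError naming the full path tried."""
    path = Path(path_str)
    if not path.is_file():
        raise FileNotFoundError(
            f"No file at {path.resolve()}; check the directory spelling"
        )
    return path


# Hypothetical usage with the path from the question:
# df = pd.read_csv(require_file("assets/NISPUF17.csv"), index_col=0)
```

With the typo in place, the error text would show the resolved "assests" directory, which makes the spelling slip easier to spot.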

Michael Halim
0

Your code should look like this. Instead of using .values, just get the Pandas Series and apply the aggregate function .sum() to it.

import pandas as pd
import numpy as np
def average_influenza_doses():

    df = pd.read_csv("assets/NISPUF17.csv", index_col=0)

    cbf_flu = df[['CBF_01','P_NUMFLU']]


    cbf_flu1 = cbf_flu[cbf_flu['CBF_01'] == 1].dropna()
    cbf_flu2 = cbf_flu[cbf_flu['CBF_01'] == 2].dropna()

    flu1 = cbf_flu1['P_NUMFLU']
    f1 = flu1.sum() / flu1.size

    flu2 = cbf_flu2['P_NUMFLU']
    f2 = flu2.sum() / flu2.size

    print(f1, f2)
    return (f1,f2)
Damir Temir
0

One of the ways to answer this question:

Type the following code to read the given dataset

import pandas as pd
df=pd.read_csv('assets/NISPUF17.csv',index_col=0)
df

Main code

def average_influenza_doses():
    # Uses the df loaded above
    BF_Flu=df[df['CBF_01']==1]
    avg_BF=BF_Flu['P_NUMFLU'].mean()
    NBF_Flu=df[df['CBF_01']==2]
    avg_NBF=NBF_Flu['P_NUMFLU'].mean()
    tup=(avg_BF,avg_NBF)
    return tup

Execute using the following code

average_influenza_doses()

Check using the following code as already given

assert len(average_influenza_doses())==2, "Return two values in a tuple, the first for yes and the second for no."

[CBF_01]=1 - Received breast milk

[CBF_01]=2 - Did not receive breast milk

[P_NUMFLU] - Number of seasonal influenza vaccine doses received
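
To illustrate how .mean() treats missing dose counts, here is a small sketch on made-up rows (not the NISPUF17 data; the helper name group_mean is hypothetical). By default Series.mean() skips NaN, so it gives the same figure as dropping the missing rows first and dividing the sum by the number of rows left:

```python
import numpy as np
import pandas as pd

# Synthetic rows for illustration only, not taken from NISPUF17.csv
toy = pd.DataFrame({
    "CBF_01": [1, 1, 1, 2, 2],
    "P_NUMFLU": [2.0, np.nan, 4.0, 1.0, 3.0],
})


def group_mean(frame, code):
    """Mean of P_NUMFLU for rows whose CBF_01 equals code; NaN doses are skipped."""
    return frame.loc[frame["CBF_01"] == code, "P_NUMFLU"].mean()


result = (group_mean(toy, 1), group_mean(toy, 2))

# Same figure computed by hand: drop missing doses, then sum / count
known = toy.loc[toy["CBF_01"] == 1, "P_NUMFLU"].dropna()
by_hand = known.sum() / known.size
```

Here result is (3.0, 2.0) and by_hand is 3.0, so an explicit dropna() is optional when only the mean is needed.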

0
import pandas as pd

def average_influenza_doses():
    df=pd.read_csv('assets/NISPUF17.csv',index_col=0)
    bf = df[df['CBF_01']==1]
    # Use ==2 (known "no"); !=1 would also pick up missing or unknown responses
    nbf = df[df['CBF_01']==2]
    mean_bf = bf['P_NUMFLU'].mean()
    mean_nbf = nbf['P_NUMFLU'].mean()
    return (mean_bf,mean_nbf)
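
Because the question asks only about children known not to have received breast milk, the comparison used for the second group matters. A minimal sketch on made-up rows, assuming the survey stores unknown responses under some code other than 1 or 2 (the 99 below is illustrative, not confirmed from the data dictionary), shows how != 1 and == 2 select different rows:

```python
import numpy as np
import pandas as pd

# Synthetic rows for illustration only; 99 stands in for an "unknown" code
toy = pd.DataFrame({
    "CBF_01": [1, 1, 2, 2, 99, np.nan],
    "P_NUMFLU": [2.0, 4.0, 1.0, 3.0, 10.0, 10.0],
})

# Known "no" answers only
mean_known_no = toy.loc[toy["CBF_01"] == 2, "P_NUMFLU"].mean()

# Everything that is not 1, including the 99 row and the NaN row
mean_not_yes = toy.loc[toy["CBF_01"] != 1, "P_NUMFLU"].mean()
```

On these rows mean_known_no is 2.0 while mean_not_yes is 6.0, since NaN != 1 evaluates to True and the extra rows are pulled into the average.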
0

import pandas as pd

def average_influenza_doses():
    df=pd.read_csv("assets/NISPUF17.csv", index_col=0)
    # Keep only rows with a known yes (1) or no (2) answer
    new_df = df[df["CBF_01"].isin([1, 2])]
    # Groups are sorted by key, so the means unpack as (yes, no)
    a,b=new_df.groupby(["CBF_01"]).P_NUMFLU.mean()
    tup = (a,b)
    return tup
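
As a variation on the groupby approach, the two means can be looked up by their group key instead of being unpacked by position, which avoids depending on the order of the groups. A minimal sketch on made-up rows (not the NISPUF17 data):

```python
import numpy as np
import pandas as pd

# Synthetic rows for illustration only; 99 stands in for an "unknown" code
toy = pd.DataFrame({
    "CBF_01": [2, 1, 99, 1, 2],
    "P_NUMFLU": [1.0, 2.0, 7.0, np.nan, 3.0],
})

# Keep known yes/no answers, then average doses within each group
means = toy[toy["CBF_01"].isin([1, 2])].groupby("CBF_01")["P_NUMFLU"].mean()

# Look up each group by its key: 1 = yes, 2 = no
result = (means.loc[1], means.loc[2])
```

On these rows result is (2.0, 2.0): the NaN dose in the first group is skipped and the row coded 99 is excluded by the isin filter.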
  • Would you double-check your comment and fix formatting? Don't know what you meant here, but pretty sure that it's not in the code. – ravenwing Oct 17 '22 at 21:32