0

I have the xls data ( Master Sheet) attached in the [![sample xlsx data][1]][1]screenshot here.

As a beginner in Python and Data science , I am trying to read the sheets in xlsx file as below:

Load pandas

import pandas as pd

# Read CSV file into DataFrame df
sheet1, sheet2 = None, None
with pd.ExcelFile("blood_data.xlsx") as reader:
    sheet1 = pd.read_excel(reader, sheet_name='Master sheet')
    sheet2 = pd.read_excel(reader, sheet_name='Another Sheet')


# Show dataframe
print(sheet1)

I am getting all the 110 rows and columns in the output, but I want to filter based on the condition which is shown in green in the screenshot. Hb Male 13-17 and Bilirubinn 0.3-1.2 . If the values are not in range I have to print the name of the candidate. Please help me in applying a multi-column-based filter using panda to achieve my result.

Sample input file: https://github.com/wittymindstech/medical-data-analysis/blob/main/blood_data.xlsx

output: MEHRAWAN, RAKHI etc

Pixzium Media
  • 109
  • 2
  • 10
  • 2
    Please share sample dataframes (input and desired output) in a way that makes instantiation easy. I'm not typing the text from the screenshot into my console. [start here](https://stackoverflow.com/a/20159305/3620003) – timgeb Jan 07 '22 at 11:33
  • SO has quite a few questions about filtering a Pandas dataframe based on multiple criteria. See https://stackoverflow.com/questions/62363144/how-to-filter-a-two-columns-with-specific-values-in-pandas for a start. – rajah9 Jan 07 '22 at 13:36
  • Does this answer your question? [How to filter a two columns with specific values in pandas](https://stackoverflow.com/questions/62363144/how-to-filter-a-two-columns-with-specific-values-in-pandas) – rajah9 Jan 07 '22 at 13:37

1 Answers1

0

You can try something like this -

import pandas as pd

# Read CSV file into DataFrame df
sheet1, sheet2 = None, None
with pd.ExcelFile("blood_data.xlsx") as reader:
    sheet1 = pd.read_excel(reader, sheet_name='Master sheet')

indexes = sheet1[ (sheet1['Gender'] == 'm') & ( sheet1['Hb (g/dl)'] <= 17) &  ( sheet1['Hb (g/dl)'] >= 13) & (sheet1['BILIRUBIN (mg/dL)'] >= 0.3) & (sheet1['BILIRUBIN (mg/dL)'] <= 1.2) ].index
final = sheet1.loc[ sheet1.index.isin(indexes), : ]
print(list(final['NAME ']))
Tomer S
  • 900
  • 1
  • 10
  • 21