
I have a dataframe imported from a CSV file using pandas read_csv. Its shape is (735, 36). I need to drop the last 33 columns, keeping only the first 3: 'code', 'proc' and 'All procedures'.

I have tried all the suggestions here

Whatever I do I get the following error:

TypeError: 'bool' object is not subscriptable

For example:

df1=df[['code','proc', 'All procedures']]
TypeError                                 Traceback (most recent call last)
<ipython-input-37-350994f9b7c6> in <module>
----> 1 df[['code','proc', 'All procedures']]

TypeError: 'bool' object is not subscriptable

I have started again. The ‘bool object is not subscriptable’ error has gone away, I think df had been overwritten.
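The cell that overwrote df is not shown, so the following is only a guess at how it could have happened. It is a minimal sketch, with a made-up one-row frame, where the name df is rebound to the result of `df.empty` (a plain bool), after which column selection fails with the same message:

```python
import pandas as pd

# Hypothetical stand-in for the real data.
df = pd.DataFrame(
    {"code": ["A01.1"], "proc": ["Hemispherectomy"], "All procedures": [20]}
)

# Accidental rebinding: `.empty` returns a plain bool, not a DataFrame.
df = df.empty

try:
    df[["code", "proc", "All procedures"]]
except TypeError as exc:
    message = str(exc)
    print(message)
```

Re-running the read_csv cell rebinds df to a DataFrame again, which fits the observation that the error disappeared after starting over.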

I am trying to use some publicly available data on hospital activity and extract data from it. I am a neurosurgeon, so you may have to be patient. The data is here: https://files.digital.nhs.uk/77/0C8B3F/hosp-epis-stat-admi-proc-2018-19-tab.xlsx

I want to extract the first three columns of the CSV in the code below, and output as excel.
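Since the goal is "the first three columns" regardless of their names, positional selection with `iloc` is one option. The frame below is a hypothetical miniature of the real 735 x 36 data, using a few values from the `head()` output; it is a sketch, not the actual file:

```python
import pandas as pd

# Hypothetical miniature of the real frame (only 5 of the 36 columns).
df = pd.DataFrame(
    {
        "code": ["A01.1", "A01.2"],
        "proc": ["Hemispherectomy", "Total lobectomy of brain"],
        "All procedures": [20, 53],
        "Main procedure": [20, 53],
        "Male": [8, 37],
    }
)

# Keep every row and the first three columns by position.
first_three = df.iloc[:, :3]
print(first_three.columns.tolist())
```

The result could then be written out with `first_three.to_excel(...)`, which needs an Excel writer engine such as openpyxl to be installed.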

My new problem is that I can't extract the columns 'proc' and 'All procedures'.

Here is my working

import matplotlib.pyplot as plt
import pandas as pd
import pygal
import os
import webbrowser

This imports one tab of the spreadsheet which I have converted to csv and renamed

df = pd.read_csv('neuro_spine_craino_just_all4.csv')
df.head(5) 
   code  proc  All procedures  Main procedure  Male  Female  Gender Unknown  Mean age  Age 0  Age 1-4  ...  Age 65-69  Age 70-74  Age 75-79  Age 80-84  Age 85-89  Age 90+  Day case  Emergency  Elective  Other
0  A01.1  Hemispherectomy  20  20  8  12  0  11.0  0  7  ...  0  0  0  0  0  0  0  0  0  0
1  A01.2  Total lobectomy of brain  53  53  37  16  0  40.0  1  1  ...  4  4  1  0  0  0  0  1  0  0
2  A01.3  Partial lobectomy of brain  174  148  95  79  0  41.0  1  5  ...  12  14  3  1  0  0  0  1  1  0
3  A01.8  Other specified major excision of tissue of brain  20  15  12  8  0  34.0  1  0  ...  0  0  0  0  0  0  0  0  0  0
4  A01.9  Unspecified major excision of tissue of brain  3  3  0  3  0  39.0  0  0  ...

df.info

code proc All procedures \
0 A01.1 Hemispherectomy 20
1 A01.2 Total lobectomy of brain 53
2 A01.3 Partial lobectomy of brain 174
3 A01.8 Other specified major excision of tissue of brain 20
4 A01.9 Unspecified major excision of tissue of brain 3

df.columns

Index(['code', 'proc', 'All procedures', 'Main procedure', 'Male', 'Female', 'Gender Unknown', ' Mean age ', 'Age 0', 'Age 1-4', 'Age 5-9', 'Age 10-14', 'Age 15', 'Age 16', 'Age 17', 'Age 18', 'Age 19', 'Age 20-24', 'Age 25-29', 'Age 30-34', 'Age 35-39', 'Age 40-44', 'Age 45-49', 'Age 50-54', 'Age 55-59', 'Age 60-64', 'Age 65-69', 'Age 70-74', 'Age 75-79', 'Age 80-84', 'Age 85-89', 'Age 90+', 'Day case', 'Emergency', 'Elective', 'Other'], dtype='object')
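Note that one label in this output, ' Mean age ', carries leading and trailing spaces, so `df['Mean age']` would raise a KeyError. It is not the cause of the problem with 'proc' and 'All procedures', but it is worth cleaning up. A minimal sketch on a made-up two-column frame:

```python
import pandas as pd

# Hypothetical frame reproducing the padded label seen in df.columns.
df = pd.DataFrame({"code": ["A01.1"], " Mean age ": [11.0]})

# Strip surrounding whitespace from every column label.
df.columns = df.columns.str.strip()
print(df.columns.tolist())
```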

df['code'], ['proc'], ['All procedures']

This will only give me the first column and index.

(0 A01.1 1 A01.2 2 A01.3 3 A01.8 4 A01.9 5 A02.1 6 A02.2 7 A02.3 8 A02.4 9 A02.5 10 A02.6 11 A02.7 12 A02.8 13 A02.9 14 A03.1 15 A03.2 16 A03.3

capnahab
  • What is `df`? From the error, it seems to be a boolean. – Dani Mesejo Oct 16 '19 at 20:34
  • Can you set up an example to show this failing on https://repl.it? It doesn't look like `df` is actually of type DataFrame. – Nick Martin Oct 16 '19 at 20:34
  • Specifically, we need a snippet of how you create df with dummy data. Not the whole df = pd.read_csv(...) but a sample that includes data, like `df = pd.DataFrame({'col1': [1,2,3], 'col2': [4,5,6]})`. I would refresh your notebook and start from scratch in case `df` got overwritten at some point executing other cells. – mayosten Oct 16 '19 at 23:52

1 Answer


To extract those columns from the dataframe you can do either

result = df[['code', 'proc', 'All procedures']]

or

result = df.loc[:, ['code', 'proc', 'All procedures']]

See the pandas documentation on indexing and selecting data for more on this.

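Your issue was that the column names were not all inside one list passed to the dataframe. In df['code'], ['proc'], ['All procedures'] only 'code' is passed to the df selector; the other two are just separate Python lists, so the whole expression is a tuple of one column and two lists.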
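To make the difference concrete, here is a small sketch on a hypothetical two-row frame (values borrowed from the question's output) comparing the original expression with the list-based selection:

```python
import pandas as pd

# Hypothetical two-row stand-in for the real data.
df = pd.DataFrame(
    {
        "code": ["A01.1", "A01.2"],
        "proc": ["Hemispherectomy", "Total lobectomy of brain"],
        "All procedures": [20, 53],
        "Main procedure": [20, 53],
    }
)

# The expression from the question: the commas build a tuple.
attempt = df["code"], ["proc"], ["All procedures"]
print(type(attempt))      # a tuple, not a DataFrame
print(type(attempt[0]))   # the 'code' column as a Series
print(attempt[1])         # just the plain list ['proc']

# One list of labels inside the selector returns a three-column DataFrame.
result = df[["code", "proc", "All procedures"]]
print(result.shape)
```

This is why the notebook output started with an opening parenthesis and showed only the 'code' column with its index.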

NickHilton