27

How can I drop or disable the indices in a pandas DataFrame?

I am learning pandas from the book "Python for Data Analysis" and I already know I can use DataFrame.drop to drop one column or one row. But I did not find anything about disabling all the indices in place.

GeauxEric
  • 2,814
  • 6
  • 26
  • 33

6 Answers

20

df.values gives you the raw NumPy ndarray without the indexes.

>>> df
   x   y
0  4  GE
1  1  RE
2  1  AE
3  4  CD
>>> df.values
array([[4, 'GE'],
       [1, 'RE'],
       [1, 'AE'],
       [4, 'CD']], dtype=object)

You cannot have a DataFrame without the indexes, they are the whole point of the DataFrame :)

But just to be clear, this operation is not inplace:

>>> df.values is df.values
False

A DataFrame keeps its data in two-dimensional arrays grouped by type, so when you ask for the whole frame at once it has to find the lowest common denominator (LCD) of all the dtypes and construct a single 2D array of that type.
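
For example, a quick sketch (assuming the same df as above) of how a single numeric column keeps its dtype while the mixed frame falls back to object:

>>> df['x'].values
array([4, 1, 1, 4])
>>> df.values.dtype
dtype('O')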

To instantiate a new DataFrame with the values from the old one, just pass the old DataFrame to the new one's constructor; no data will be copied, the same underlying data structures will be reused:

>>> df1 = pd.DataFrame([[1, 2], [3, 4]])
>>> df2 = pd.DataFrame(df1)
>>> df2.iloc[0,0] = 42
>>> df1
    0  1
0  42  2
1   3  4

But you can explicitly specify the copy parameter:

>>> df1 = pd.DataFrame([[1, 2], [3, 4]])
>>> df2 = pd.DataFrame(df1, copy=True)
>>> df2.iloc[0,0] = 42
>>> df1
   0  1
0  1  2
1  3  4
Viktor Kerkez
  • 45,070
  • 12
  • 104
  • 85
  • Thank you. What I did is to initiate a new dataframe with the values of the old dataframe. – GeauxEric Aug 17 '13 at 15:08
  • I think what I really want to do is to write the data to a file without the indices, and that can be easily done by setting index=False. Sorry I did not make my question clear in the first place. Your answer is very intuitive. – GeauxEric Aug 17 '13 at 16:57
  • What do you mean by "whole point of dataframe is the index". I'm a spark developer and the spark dataframe is for my purposes (data manipulation) more powerful than pandas yet it has no index – WestCoastProjects Aug 31 '22 at 06:01
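
As GeauxEric's comment above notes, if the goal is just to write the data to a file without the index, the writer methods accept index=False; a minimal sketch (file names are illustrative, assuming the df from the answer above):

>>> df.to_csv('data.csv', index=False)     # no index column in the CSV
>>> df.to_excel('data.xlsx', index=False)  # no index column in the worksheet
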
4
d.index = range(len(d))

does a simple in-place index reset, i.e. it replaces all of the existing index labels with a plain integer index (0 to len(d) - 1), which is the most basic index type a pandas DataFrame can have.
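
A small sketch of the effect (the frame d here is purely illustrative); reset_index(drop=True) produces the same plain integer index, but returns a new frame instead of modifying d in place:

import pandas as pd

d = pd.DataFrame({'x': [4, 1, 1, 4]}, index=['a', 'b', 'c', 'd'])
d.index = range(len(d))        # in place: index becomes 0, 1, 2, 3
d = d.reset_index(drop=True)   # equivalent result, returned as a new frame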

naught101
  • 18,687
  • 19
  • 90
  • 138
2

Additionally, if you are writing the DataFrame to an Excel worksheet via df.to_excel with a pd.ExcelWriter, you can specify index=False in your parameters there.

Create the Excel writer:

writer = pd.ExcelWriter(type_box + '-rules_output-' + date_string + '.xlsx', engine='xlsxwriter')

We have a list of comma-separated strings called lines:

# create a dataframe called 'df'
df = pd.DataFrame([sub.split(",") for sub in lines], columns=["Rule", "Device", "Status"])

# convert df to an Excel worksheet
df.to_excel(writer, sheet_name='all_status', index=False)
writer.save()
user1438038
  • 5,821
  • 6
  • 60
  • 94
Jason Sprong
  • 119
  • 1
  • 1
1

I was having a similar issue trying to take a DataFrame from an index-less CSV and write it back to another file.

I came up with the following:

import pandas as pd
import os

def csv_to_df(csv_filepath):
    # the read_table method allows you to set an index_col to False, from_csv does not
    dataframe_conversion = pd.io.parsers.read_table(csv_filepath, sep='\t', header=0, index_col=False)
    return dataframe_conversion

def df_to_excel(df):
    from pandas import ExcelWriter
    # Output file name (with the .xlsx extension)
    file_name = 'foo.xlsx'
    # Build the full output path
    file_path = os.path.join('some/directory/', file_name)
    # Write the file out
    writer = ExcelWriter(file_path)
    # index_label + index are set to `False` so that all the data starts on row
    # index 1 and column labels (called headers by pandas) are all on row index 0.
    df.to_excel(writer, 'Attributions Detail', index_label=False, index=False, header=True)
    writer.save()
matthew.
  • 176
  • 5
0

I have a function that may help some. I combine CSV files with a header in the following way in Python:

    def combine_csvs(filedict, combined_file):
        files = filedict['files']
        df = pd.read_csv(files[0])
        for file in files[1:]:
            df = pd.concat([df, pd.read_csv(file)])
        df.to_csv(combined_file, index=False)
        return df

It can take as many files as you need. Call this as:

    combine_csvs(dict(files=["file1.csv","file2.csv", "file3.csv"]), 'output.csv')

Or if you are reading the dataframe in python as:

    df = combine_csvs(dict(files=["file1.csv","file2.csv"]), 'output.csv')

The combine_csvs function does not save the indices. If you need the indices, use index=True instead.

Sudipta Basak
  • 3,089
  • 2
  • 18
  • 14
0

Just set the indices to blank:

import numpy as np
import pandas as pd

data      = np.zeros([4,2])
row_index = np.array(["","","",""])
col_index = ["colA", "colB"]
table     = pd.DataFrame(data,index = row_index , columns=col_index)
print(f'Table: \n{table}')

Output:

Table: 
  colA  colB
   0.0   0.0
   0.0   0.0
   0.0   0.0
   0.0   0.0
wayne
  • 11
  • 2