I am a Python beginner trying to delete some columns from a CSV file. The deletion itself works, but whenever I write the data with pandas, it automatically adds an extra column of row numbers at the beginning of the file. How can I avoid that?

The input data looks like this (JSON):

    [
      {
        "source": "twitter",
        "cashtag": "$FB",
        "sentiment score": "0.366",
        "id": "719659409228451840",
        "spans": [
          "watching for bounce tomorrow"
        ]
      }, ... ]

Converting to csv worked well.

My code for doing this:

import pandas as pd

# Convert JSON to CSV
pd.read_json("test.json").to_csv("test.csv")

# Delete cashtag, id, source column
data = pd.read_csv("test.csv")
data = data.drop(["cashtag", "id", "source"], axis=1)
data.to_csv("test_cleaned.csv")
data.head()

Output:

   Unnamed: 0  sentiment score  spans
0           0            0.366  ['watching for bounce tomorrow']
1           1            0.638  ['record number of passengers served in 2015']
2           2           -0.494  ['out $NFLX -.35']
3           3            0.460  ['Looking for a strong bounce', 'Lunchtime 
4           4            0.403  ['Very intrigued with the technology and 

What I want to have:

sentiment score  spans
          0.366  ['watching for bounce tomorrow']
          0.638  ['record number of passengers served in 2015']
         -0.494  ['out $NFLX -.35']
          0.460  ['Looking for a strong bounce', 'Lunchtime 
          0.403  ['Very intrigued with the technology and 

So the conversion and the deletion both work, but every time I write a file with pandas, it adds another column at the beginning: one column after converting to CSV and a second one after deleting the columns. How can I avoid this?

Fabs
    The pandas online documentation is a great resource. When trying to figure out how a function works, this is a great place to start. – lmo Jun 09 '19 at 14:38

2 Answers

That column is called the index. You can prevent it from being written by passing index=False to to_csv:

df.to_csv('FileMaker.csv', index=False)
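
To see the effect in isolation, here is a minimal sketch that writes the same frame twice and compares the header lines. The two-row frame and the in-memory buffers are assumptions made for illustration; they are not the asker's actual data or files.

```python
import io

import pandas as pd

# Hypothetical two-row frame standing in for the question's data.
df = pd.DataFrame({"sentiment score": [0.366, 0.638], "spans": ["a", "b"]})

# Default behaviour (index=True): the row labels become an unnamed first column.
with_index = io.StringIO()
df.to_csv(with_index)

# index=False: only the real columns are written.
without_index = io.StringIO()
df.to_csv(without_index, index=False)

header_with = with_index.getvalue().splitlines()[0]
header_without = without_index.getvalue().splitlines()[0]

print(header_with)     # ,sentiment score,spans
print(header_without)  # sentiment score,spans
```

The leading comma in the first header is the empty name of the index column.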
pypypy

The column you're referring to is the index. Try doing this when saving your csv:

data.to_csv("test_cleaned.csv", index=False)

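pandas automatically creates a default integer index for every DataFrame unless you set one explicitly, and to_csv writes that index as the first column by default. When such a file is read back with read_csv, the index column has no header, so it shows up as "Unnamed: 0". I highly recommend reading the pandas documentation for more information.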
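
As a rough sketch of the question's pipeline with the index suppressed, the example below drops the three columns and round-trips the result through CSV. The records list is a made-up sample in the shape of the question's JSON (the second row's cashtag and id are placeholders), and in-memory buffers stand in for test.json and test_cleaned.csv.

```python
import io

import pandas as pd

# Hypothetical records shaped like the question's JSON; not the real data set.
records = [
    {"source": "twitter", "cashtag": "$FB", "sentiment score": "0.366",
     "id": "719659409228451840", "spans": ["watching for bounce tomorrow"]},
    {"source": "twitter", "cashtag": "$XX", "sentiment score": "0.638",
     "id": "0", "spans": ["record number of passengers served in 2015"]},
]

data = pd.DataFrame(records)
data = data.drop(columns=["cashtag", "id", "source"])

# Write without the index, then read the result back.
clean = io.StringIO()
data.to_csv(clean, index=False)
clean.seek(0)
clean_columns = list(pd.read_csv(clean).columns)
print(clean_columns)  # ['sentiment score', 'spans']

# For contrast: writing with the default index reproduces the extra column.
leaky = io.StringIO()
data.to_csv(leaky)
leaky.seek(0)
leaky_columns = list(pd.read_csv(leaky).columns)
print(leaky_columns)  # ['Unnamed: 0', 'sentiment score', 'spans']

# If a file was already written with its index, index_col=0 restores it
# as the index on read instead of treating it as a data column.
leaky.seek(0)
restored_columns = list(pd.read_csv(leaky, index_col=0).columns)
print(restored_columns)  # ['sentiment score', 'spans']
```

Dropping the columns before the first write also removes the need for the intermediate CSV file, though keeping it works just as well as long as every to_csv call passes index=False.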

Pedro Martins de Souza