
This is my CSV file:

CommitId                                RefactoringType      RefactoringDetail
d38f7b334856ed4007fb3ec0f8a5f7499ee2f2b8    Pull Up Attribute   "Pull Up Attribute  protected steps : int from class blokusgame.mi.android.hazi.blokus.GameLogic.PlayerAlgorithm to class blokusgame.mi.android.hazi.blokus.GameLogic.Player"
d38f7b334856ed4007fb3ec0f8a5f7499ee2f2b8    Pull Up Attribute   "Pull Up Attribute  protected steps : int from class blokusgame.mi.android.hazi.blokus.GameLogic.PlayerAlgorithm to class blokusgame.mi.android.hazi.blokus.GameLogic.Player"
d38f7b334856ed4007fb3ec0f8a5f7499ee2f2b8    Pull Up Attribute   "Pull Up Attribute  protected steps : int from class blokusgame.mi.android.hazi.blokus.GameLogic.PlayerAlgorithm to class blokusgame.mi.android.hazi.blokus.GameLogic.Pla

I need to extract this:

RefactoringDetail
"Pull Up Attribute  protected steps : int from class blokusgame.mi.android.hazi.blokus.GameLogic.PlayerAlgorithm to class blokusgame.mi.android.hazi.blokus.GameLogic.Player"
"Pull Up Attribute  protected steps : int from class blokusgame.mi.android.hazi.blokus.GameLogic.PlayerAlgorithm to class blokusgame.mi.android.hazi.blokus.GameLogic.Player"
"Pull Up Attribute  protected steps : int from class blokusgame.mi.android.hazi.blokus.GameLogic.PlayerAlgorithm to class blokusgame.mi.android.hazi.blokus.GameLogic.Player"

I tried this code:

import pandas as pd
df = pd.read_csv('result_refactorings.csv', sep='delimiter', header=None)
df.iloc[:,-1]

It returns all the data instead of just the RefactoringDetail column.

Any help please!

Henda Drid

2 Answers

1

If you want to use only the built-in csv module:

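import csv
import re

third_column = []
with open("result_refactorings.csv") as csvfile:
    # Collapse each run of two or more spaces into a single tab so the
    # columns become tab-delimited.
    fixed_spaces = [re.sub(r" {2,}", "\t", line) for line in csvfile]
    # DictReader accepts any iterable of lines, not only a file object.
    reader = csv.DictReader(fixed_spaces, delimiter="\t")
    for row in reader:
        print(row["RefactoringDetail"])
        third_column.append(row["RefactoringDetail"])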

This code both prints the third column and appends each of its items to the list third_column. Remove one or the other depending on what you want to do.

EDIT: On closer inspection, your CSV input appears to be delimited by an uneven number of spaces rather than by tabs, which is what it looks like at first glance. I added a small regex that replaces two or more consecutive spaces with an actual tab, since in its current state the file isn't valid CSV.
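Since the delimiter is hard to judge from pasted text, it can help to check the header line before parsing. The following is only a minimal sketch, not part of the original answer: `guess_delimiter` is a hypothetical helper, the candidate list is an assumption, and the inline sample (with `;` as the separator) stands in for the real file.

```python
import csv
import io

# Hypothetical inline sample standing in for result_refactorings.csv.
# The ';' separator used here is an assumption about the real file.
DETAIL = (
    "Pull Up Attribute  protected steps : int from class "
    "blokusgame.mi.android.hazi.blokus.GameLogic.PlayerAlgorithm to class "
    "blokusgame.mi.android.hazi.blokus.GameLogic.Player"
)
SAMPLE = (
    "CommitId;RefactoringType;RefactoringDetail\n"
    f'd38f7b334856ed4007fb3ec0f8a5f7499ee2f2b8;Pull Up Attribute;"{DETAIL}"\n'
)


def guess_delimiter(header_line, candidates=(";", ",", "\t")):
    """Return the candidate character that occurs most often in the header."""
    return max(candidates, key=header_line.count)


# Inspect the header first, then parse with whatever delimiter it suggests.
header = SAMPLE.splitlines()[0]
delimiter = guess_delimiter(header)

reader = csv.DictReader(io.StringIO(SAMPLE), delimiter=delimiter)
third_column = [row["RefactoringDetail"] for row in reader]
print(repr(delimiter), third_column)
```

With a real file, the same idea applies by reading the first line from the open file instead of from the sample string. Counting characters in the header is a crude heuristic, so it is worth printing the header with repr() to confirm the guess.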

Zhenhir
  • Thanks, but it returns this error: KeyError: 'RefactoringDetail', and it's a common error for all the code that I tried – Henda Drid May 29 '19 at 07:28
  • Did you try it with the regex that I edited in? It didn't work for me before I added that. Make sure reader is running on `fixed_spaces`. – Zhenhir May 29 '19 at 07:29
  • Alternatively, can you show me what one row looks like when you just do `print(row)` instead? – Zhenhir May 29 '19 at 07:31
  • It also returns an error: Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode? – Henda Drid May 29 '19 at 07:39
  • I pasted your example directly from this page and I didn't need to do that. There must be some issue that didn't carry across. Can you see what happens if you just run `print(open("result_refactorings.csv").readlines()[0])`? – Zhenhir May 29 '19 at 07:58
  • It returns this: CommitId;RefactoringType;RefactoringDetail – Henda Drid May 29 '19 at 08:07
0

Pandas is spectacular for dealing with CSV files, and the following code is all you need to read one and store an entire column in a variable:

import pandas as pd
df = pd.read_csv('test.csv', sep=';')
refactoring_details = df['RefactoringDetail']
print(refactoring_details)

Edit: the separator in the provided file is `;` rather than the default `,`, so it has to be passed explicitly via sep.
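If only that one column is needed, pandas can also skip the others while reading. This is a minimal sketch under assumptions: the inline sample is hypothetical and stands in for the real file, and `usecols` is an optional `read_csv` parameter not used in the answer above.

```python
import io

import pandas as pd

# Hypothetical inline sample standing in for the real ';'-separated file.
DETAIL = (
    "Pull Up Attribute  protected steps : int from class "
    "blokusgame.mi.android.hazi.blokus.GameLogic.PlayerAlgorithm to class "
    "blokusgame.mi.android.hazi.blokus.GameLogic.Player"
)
SAMPLE = (
    "CommitId;RefactoringType;RefactoringDetail\n"
    f'd38f7b334856ed4007fb3ec0f8a5f7499ee2f2b8;Pull Up Attribute;"{DETAIL}"\n'
    f'd38f7b334856ed4007fb3ec0f8a5f7499ee2f2b8;Pull Up Attribute;"{DETAIL}"\n'
)

# usecols restricts parsing to the named column; sep matches the file.
df = pd.read_csv(io.StringIO(SAMPLE), sep=";", usecols=["RefactoringDetail"])

# Convert the single column to a plain Python list.
refactoring_details = df["RefactoringDetail"].tolist()
print(refactoring_details)
```

For a file on disk, the file path replaces the `io.StringIO(SAMPLE)` argument; everything else stays the same.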

Toni Sredanović