I've tried two approaches so far for removing punctuation from a list (built from a CSV file):

First, the code that loads the data:

import csv
import re
import pandas as pd
articles = pd.read_csv('Guardian_Syria_text.csv', sep='delimiter', header=None)
file = open("Guardian_Syria_text.csv", mode="r", encoding='utf-8-sig')
data = list(csv.reader(file, delimiter=","))
file.close()

First option for removing punctuation:

def remove_punc(string):
  punc = '''!()-[]{};:'"\, <>./?@#$%^&*_~'''
  for ele in string:
    if ele in punc:
      string = string.replace(ele, "")
  return string
data = [remove_punc(i) for i in data]

Second option for removing punctuation:

import string
new_list = []
for word in data:
  for character in word:
    if character in string.punctuation:
      word = word.replace(character, "")
  new_list.append(word)

In both cases the code runs without errors, but data still contains commas and other punctuation marks afterwards. The output also shows an escape character (\) before apostrophes ('), as in "UK's".

The first code also gives me a DeprecationWarning: invalid escape sequence '\,' (from the punc string), and I can't find a solution for it.
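A minimal sketch of one possible fix, under a few assumptions: `csv.reader` yields each row as a list of strings, so `data` is a list of lists, and both loops above call string methods on a whole row rather than on the text inside it. The sample `data` below is made up for illustration. The space is left out of `punc` here so that words stay separated, which is a guess at the intent. Writing `punc` as a raw string keeps the backslash literal, so there is no `\,` escape for Python to warn about.

```python
# Assumption: `data` mimics what list(csv.reader(...)) returns,
# i.e. a list of rows, each row being a list of strings (sample values are made up).
data = [["The UK's role, explained."], ["Syria: what next?", "Part 1 (of 2)"]]

# Raw string: the backslash is a literal character, not the start of an escape.
punc = r'''!()-[]{};:'"\,<>./?@#$%^&*_~'''

# Translation table that deletes every character listed in `punc`.
table = str.maketrans("", "", punc)

def remove_punc(text):
    """Return `text` with all characters from `punc` removed."""
    return text.translate(table)

# Apply the function to each cell (a string), not to each row (a list).
cleaned = [[remove_punc(cell) for cell in row] for row in data]
print(cleaned)
```

The backslash before apostrophes is most likely only how Python displays a string that contains both quote types when a list is printed; printing a single cell with `print(cell)` should show the text without it.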

Linda Brck