-2

How do I split the values inside round brackets into two columns? I have a data frame as shown below.

[image]

The problem is that the lengths of the first part (before the comma) and the second part (after the comma) are not fixed; they may vary.

I want to put the two values inside the round brackets into two separate new columns named "Pos" and "state".

Can you help me with the Python code to implement this?

Below is what I want to achieve:

[image]

AMC
  • I tried regex and split. Split gives a length error, and the regex is also not working as expected. – Rachit Gupta Apr 12 '20 at 01:07
  • Does this answer your question? [How to unpack a Series of tuples in Pandas?](https://stackoverflow.com/questions/22799300/how-to-unpack-a-series-of-tuples-in-pandas) – mcskinner Apr 12 '20 at 01:09
  • What exactly is the issue? Please see [mcve], [ask], [help/on-topic]. – AMC Jun 26 '20 at 01:16

3 Answers

0

Parentheses, or round brackets as you call them, denote a data type called a tuple in Python.

If the data is static, you can unpack a list of tuples in several ways. Here is an easy one:

arr = [(5,5), (6,7)]
listOfFirstItems, listOfSecondItems = zip(*arr)
# zip returns tuples, so:
# listOfFirstItems = (5, 6)
# listOfSecondItems = (5, 7)

I am not 100% sure of your data structure, but once the values are unpacked like this you can add them as new columns as needed.
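If the column really holds Python tuples, the same zip idiom can fill the two new columns. This is a minimal sketch under assumptions: the column name Sentiment is borrowed from the other answers on this page, and the values are mock data invented for illustration. It does not apply if the column holds strings such as "(0.5, positive)" read from a CSV file.

```python
import pandas as pd

# Mock data (assumed): each cell is a real Python tuple, not a string
df = pd.DataFrame({"Sentiment": [(0.5, "positive"), (-0.25, "negative")]})

# zip(*...) transposes the tuples: one sequence of first items,
# one sequence of second items
first_items, second_items = zip(*df["Sentiment"])

df["Pos"] = list(first_items)
df["state"] = list(second_items)

print(df)
```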

Fallenreaper
0
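# .str[i] takes the i-th piece of every row; a plain [i] would select the row labelled i
df['pos'] = df.Sentiment.str.split(',', n=1).str[0].str.strip('( ')
df['state'] = df.Sentiment.str.split(',', n=1).str[1].str.strip(' )')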
  • 2
    Please don't post only code as an answer, but also provide an explanation of what your code does and how it solves the problem in the question. Answers with an explanation are usually of higher quality, and are more likely to attract upvotes. – Mark Rotteveel Apr 12 '20 at 07:42
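One detail matters when indexing the result of .str.split: a plain [0] selects the row labelled 0, while .str[0] selects the first piece of every row. Assigning a single row's list to a whole column can fail with a length mismatch when the list and the frame have different sizes. The sketch below illustrates the difference with mock values, which are assumptions made for illustration.

```python
import pandas as pd

# Mock data (assumed): each value is a string such as "(0.5, positive)"
df = pd.DataFrame(
    {"Sentiment": ["(0.5, positive)", "(-0.25, negative)", "(0.0, neutral)"]}
)

# Each row becomes a list with two pieces
parts = df["Sentiment"].str.split(",", n=1)

row_zero = parts[0]          # the list of pieces for the row labelled 0
first_pieces = parts.str[0]  # the first piece of every row, as a Series

print(row_zero)
print(first_pieces)
```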
0

First, import pandas and read the CSV file into a DataFrame.

Use the .str.split method to split the "Sentiment" column in two at the first comma.

Then create the new columns, removing any leading and trailing brackets from the string values with .str.strip.

Print the data or, if you prefer, write it to a new CSV file with the .to_csv method.

Remember to change the CSV file names passed to .read_csv and .to_csv to match your own files.

Complete Code:

import pandas as pd

# reading csv
data = pd.read_csv("file.csv")

# new data frame with split value columns 
splitData = data["Sentiment"].str.split(",", n=1, expand=True)

# making new column Pos from first part of the split data, 
# also remove front and back brackets if any
data["Pos"] = splitData[0].str.strip("()")

# making new column state from second part of the split data, 
# also remove front and back brackets if any
data["state"] = splitData[1].str.strip("()")

# print data
print(data)

# write back to a new csv file
data.to_csv('newFile.csv')

Below are the outputs using mock data:

Print to terminal: [image]

New CSV: [image]
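Since the question mentions trying a regex, a possible alternative to the split-and-strip approach above is Series.str.extract with two named groups. This is only a sketch: the mock values and the pattern are assumptions, and the pattern expects exactly one pair of round brackets with a comma separating the two parts.

```python
import pandas as pd

# Mock data (assumed): strings of the form "(first, second)"
data = pd.DataFrame({"Sentiment": ["(0.5, positive)", "(-0.25, negative)"]})

# Group "Pos" is everything before the first comma, group "state" is
# everything after it, with surrounding whitespace and brackets excluded
pattern = r"\(\s*(?P<Pos>[^,]*?)\s*,\s*(?P<state>.*?)\s*\)"

# str.extract returns a DataFrame with one column per named group
extracted = data["Sentiment"].str.extract(pattern)

data["Pos"] = extracted["Pos"]
data["state"] = extracted["state"]

print(data)
```

Both new columns hold strings here; pd.to_numeric can convert "Pos" afterwards if it is numeric in your data.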

hiew1