I have a CSV file with hundreds of rows like shown below! How can I merge these two rows in such a way that only second column value gets updated to CVE-2017-3006, CVE-2017-3007 and the rest remains the same
Asked
Active
Viewed 58 times
-2
-
2Does this answer your question? [pandas - Merge nearly duplicate rows based on column value](https://stackoverflow.com/questions/36271413/pandas-merge-nearly-duplicate-rows-based-on-column-value) – vahdet Feb 07 '20 at 14:13
1 Answers
1
import pandas as pd
# Create a dummy dataframe like yours
df = pd.DataFrame([
{"C1": "0", "C2": "A", "C3": "9.0", "C4": "High", "C5": "zaid", "C6": "TCP", "C7": "445", "C8": "some_text", "C9": "some_other_text"},
{"C1": "0", "C2": "B", "C3": "9.0", "C4": "High", "C5": "zaid", "C6": "TCP", "C7": "445", "C8": "some_text", "C9": "some_other_text"},
{"C1": "1", "C2": "A", "C3": "17.0", "C4": "High", "C5": "zaid", "C6": "TCP", "C7": "445", "C8": "some_text", "C9": "some_other_text"},
{"C1": "1", "C2": "B", "C3": "17.0", "C4": "High", "C5": "zaid", "C6": "TCP", "C7": "445", "C8": "some_text", "C9": "some_other_text"},
{"C1": "1", "C2": "C", "C3": "17.0", "C4": "High", "C5": "zaid", "C6": "TCP", "C7": "445", "C8": "some_text", "C9": "some_other_text"},
])
# Group by all columns you want to keep, and aggregate C2 into a list
grouped = df.groupby(["C1","C3","C4","C5","C6","C7","C8","C9"], as_index=False).agg({
"C2":lambda x:x.tolist()
})
print(grouped)
C1 C3 C4 C5 C6 C7 C8 C9 C2
0 0 9.0 High zaid TCP 445 some_text some_other_text [A, B]
1 1 17.0 High zaid TCP 445 some_text some_other_text [A, B, C]

tdpr
- 212
- 1
- 4