Original title:Pandas/Python Drop columns base on syntax ':' existence in cell
Since the the solution went a totally different approach I changed the title
============
Consider a dataframe from txt file.
Data txt file: txt
I removed the unwanted lines that contains "Iteration"
import pandas as pd
data = pd.read_csv('results.txt', sep=" ", header=None)
data = data[~data[1].astype(str).str.startswith('Iteration')]
Now its like this: pd
I wanted to remove all the columns that contains ":" to result something like: {2, 7}{2, 7}{2, 7}{2, 7}{1, 2, 7}{1, 2, 6, 7} {1, 2, 6, 7}{1, 2, 3, 6, 7}{1, 2, 3, 6, 7} {1, 3, 4, 6, 7}{1, 3, 4, 5, 6} {3, 4, 5, 6, 7}{1, 3, 4, 5, 7} {1, 4, 5, 6, 7}, .......etc.
and to merge columns such as: "{2," " 7}," into one column "{2,7}" Then I can replace all the column names with timestamps (6:30, 7:00 etc) to make it a proper table like :
6:30 7:00 7:30 8:00 8:30 9:00
{2,7} {2,7} {2,7} {2,7} {1,2,7} {1,2,6,7}
{5,6} {5,6} {5,6} {5,6} {1,5,6} {1,4,5,6}
....
My first step now is to remove columns with ":" , I tried these:
data.loc[:, ~(data[0:122].astype(str).apply(lambda x: x.str.contains(':'))).any()]
data.loc[:, ~(data.contains(':')).any()]
obviously they won't work since contains() cannot be use on data frames, I am not very familier with pandas. I searched for many other answers and wasn't able to get it right. I wonder what is the correct way to remove columns with cells that contains ":"
Or if my approch full of unnecessary steps, its there a better approach to turn the txt file into a dataframe of
6:30 7:00 7:30 8:00 8:30 9:00
{2,7} {2,7} {2,7} {2,7} {1,2,7} {1,2,6,7}
{5,6} {5,6} {5,6} {5,6} {1,5,6,} {1,4,5,6}
like sort it when reading the txt file at first place and loop through lines and make dictionaries instead? '6:30': {2, 7} ....