I am new to Python, so sorry if this is too obvious.
I have a pandas DataFrame that looks like the one below:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(5, 10))
df.columns = ['date1', 'date2', 'date3', 'name1', 'col1', 'col2', 'col3', 'name2', 'date4', 'date5']
      date1     date2     date3     name1      col1      col2      col3  \
0 -0.177090  0.417442 -0.930226  0.460750  1.062997  0.534942 -1.082967
1 -0.942154  0.047837 -0.494979  2.437469 -0.446984  0.709556 -0.135978
2 -1.544783  0.129307 -0.169556 -0.890697  2.650924  0.976610  0.290226
3 -0.651220 -0.196342  0.712601  0.641927 -0.009921 -0.038450  0.498087
4 -0.299145 -1.407747  1.914364  0.554330 -0.196702  2.037057 -0.287942

      name2     date4     date5
0 -0.318310  0.358619 -0.243150
1  1.171024  0.277943 -1.584723
2 -0.546707 -1.951831  0.678125
3 -0.510261 -0.018574 -0.212684
4  1.929841  0.995625 -1.125044
I'd like to keep only the columns that have, for example, 'date' in their names. That is, I want to keep 'date1', 'date2', 'date3', 'date4', 'date5', and so on, and drop the rest. In some statistical packages I can use * as a wildcard that matches any characters and write a command like this:
keep date*
Is there an equivalent way of doing this in Python with pandas?
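For illustration, here is a minimal sketch of the kind of thing I am after, not necessarily the best way to do it. It assumes that pandas' `DataFrame.filter` (with its `regex` and `like` parameters) and the `.str.startswith` accessor on the column index are suitable for this; the variable names `date_cols`, `date_cols_alt` and `date_like` are made up for the example.

```python
import numpy as np
import pandas as pd

# Same setup as in the question: 5 rows, 10 named columns
df = pd.DataFrame(np.random.randn(5, 10))
df.columns = ['date1', 'date2', 'date3', 'name1', 'col1',
              'col2', 'col3', 'name2', 'date4', 'date5']

# Closest to "keep date*": keep columns whose name starts with 'date'.
# filter(regex=...) matches the pattern against each column label.
date_cols = df.filter(regex=r'^date')

# The same selection written as a boolean mask over the column labels
date_cols_alt = df.loc[:, df.columns.str.startswith('date')]

# Looser variant: keep columns with 'date' anywhere in the name
date_like = df.filter(like='date')

print(list(date_cols.columns))
```

With the column names above, all three selections should end up with the same five 'date' columns; `like='date'` would additionally pick up a name such as 'update1', which the `^date` pattern would not.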
Thanks very much for any help.