
I would like to create a new data frame from some existing data frames by forming all possible combinations of their column values. This is quite easy in SAS (or with the expand.grid function in R):

create table combine_var as
select *
from var_1, var_2;

But I am not sure what the equivalent is in Python. For instance, my data frames look like:

import pandas as pd
var_1 = pd.DataFrame({'val_1': [0.00789, 0.01448, 0.03157]})  # from_items was removed in pandas 1.0
var_2 = pd.DataFrame({'val_2': [0.5, 1.0]})

The output I expect is:

val_1   val_2
0.00789 0.5
0.00789 1.0
0.01448 0.5
0.01448 1.0
0.03157 0.5
0.03157 1.0
TTT

1 Answer


You could use expand_grid, which is given in the cookbook section of the pandas docs:

import itertools
import pandas as pd

def expand_grid(data_dict):
  # one row per combination of the dictionary's value lists
  rows = itertools.product(*data_dict.values())
  return pd.DataFrame.from_records(rows, columns=data_dict.keys())

expand_grid({'val_1': [0.00789, 0.01448, 0.03157], 'val_2' : [0.5, 1.0]})

In [107]: expand_grid({'val_1': [0.00789, 0.01448, 0.03157], 'val_2' : [0.5, 1.0]})
Out[107]:
     val_1  val_2
0  0.00789    0.5
1  0.00789    1.0
2  0.01448    0.5
3  0.01448    1.0
4  0.03157    0.5
5  0.03157    1.0
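
To see what itertools.product contributes on its own, here is a small sketch without pandas. The name data_dict simply mirrors the function argument above; the rest is the standard library. It relies on dictionaries keeping insertion order (Python 3.7+), so val_1 comes first in every tuple.

```python
import itertools

data_dict = {"val_1": [0.00789, 0.01448, 0.03157], "val_2": [0.5, 1.0]}

# product(*values) yields one tuple per combination;
# the last iterable (val_2 here) varies fastest
rows = list(itertools.product(*data_dict.values()))

print(len(rows))  # 3 * 2 = 6 combinations
print(rows[:2])   # [(0.00789, 0.5), (0.00789, 1.0)]
```

Each tuple becomes one row of the resulting data frame, which is why the output above lists val_2 cycling within each val_1 value.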

EDIT

For existing dataframes, you first need to build a single dictionary from them. You could merge the per-dataframe dictionaries into one using one of the answers to that question. Example for your case:

expand_grid(dict(var_1.to_dict('list'), **var_2.to_dict('list')))

In [122]: expand_grid(dict(var_1.to_dict('list'), **var_2.to_dict('list')))
Out[122]:
     val_1  val_2
0  0.00789    0.5
1  0.00789    1.0
2  0.01448    0.5
3  0.01448    1.0
4  0.03157    0.5
5  0.03157    1.0
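
If converting to dictionaries is not wanted, a possible alternative is a cross merge. This is a minimal sketch that assumes pandas 1.2 or later, where merge accepts how="cross"; on older versions that option is not available and the dictionary approach above still applies. The name combined is just illustrative.

```python
import pandas as pd

var_1 = pd.DataFrame({"val_1": [0.00789, 0.01448, 0.03157]})
var_2 = pd.DataFrame({"val_2": [0.5, 1.0]})

# how="cross" (assumes pandas >= 1.2) pairs every row of var_1
# with every row of var_2, i.e. a Cartesian product of the rows
combined = var_1.merge(var_2, how="cross")

print(combined)
```

This takes the data frames directly and also works when each frame has several columns, since it combines whole rows rather than individual column lists.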
Anton Protopopov
  • Yes, I saw that code, but it needs a dictionary first. I'm just wondering whether there is a function that takes data frames directly – TTT Dec 01 '15 at 07:28
  • You could convert them to dicts, combine those into one, and pass it to that function – Anton Protopopov Dec 01 '15 at 07:33
  • Thanks for your solution! It seems that at the moment there is no built-in function in pandas to achieve this... – TTT Dec 01 '15 at 07:36
  • At least I don't know of one. It's probably worth filing a feature request for the `pandas` package – Anton Protopopov Dec 01 '15 at 07:39