I have the following data:

    production_type       type_a       type_b     type_c     type_d     
0             type_a     1.173783    0.714846    0.583621        1
1             type_b     1.418876    0.864110    0.705485        1
2             type_c     1.560452    0.950331    0.775878        1
3             type_d     1.750531    1.066091    0.870388        1
4             type_a     1.797883    1.094929    0.893932        1
5             type_a     1.461784    0.890241    0.726819        1
6             type_b     0.941938    0.573650    0.468344        1
7             type_a     0.507370    0.308994    0.252271        1
8             type_c     0.443565    0.270136    0.220547        1
9             type_d     0.426232    0.259579    0.211928        1
10            type_d     0.425379    0.259060    0.211504        1

I would like to create a new column, list, or Series that returns, for each row, the value from the column named in production_type.

Desired output:

    production_type       type_a       type_b     type_c     type_d     Results 
0             type_a     1.173783    0.714846    0.583621        1     1.173783    
1             type_b     1.418876    0.864110    0.705485        1     0.864110    
2             type_c     1.560452    0.950331    0.775878        1     0.775878        
3             type_d     1.750531    1.066091    0.870388        1     1
4             type_a     1.797883    1.094929    0.893932        1     1.797883
5             type_a     1.461784    0.890241    0.726819        1     1.461784
6             type_b     0.941938    0.573650    0.468344        1     0.573650

Basically, if the [production_type] column contains type_a, I want the [Results] column to hold that row's type_a value, and likewise for the other types.

I've tried the following:

for i in df:
    if i == 'type_a':
        print ('type_a')
    elif i == 'type_b':
        print ('type_b')
    elif i == 'type_c':
        print ('type_c')
    elif i == 'type_d':
        print ('type_d')
    else:
        print('')  
    print('')   

I also tried using result.append.

To generate the dataframe, use the following:

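import numpy as np
import pandas as pd

# Three random value columns plus a constant type_d column
list_cols = ['type_a', 'type_b', 'type_c']
df = pd.DataFrame(np.random.rand(10, 3), columns=list_cols)
df['production_type'] = ['type_a', 'type_b', 'type_c', 'type_d', 'type_a',
                         'type_a', 'type_b', 'type_b', 'type_c', 'type_d']
df['type_d'] = 1
df['results'] = ''  # placeholder column to be filled in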

Any hint on where to search?

jpp
Joe
  • Can you add this dataframe as numpy arrays using df.values in your question. It would be easier to use that to create dataframe on my machine instead of just typing it. – jat Apr 18 '18 at 09:03
  • 1. Read the csv file, 2. Calculate the results, 3. Write a new csv file. Example: https://stackoverflow.com/a/20336519/1251007 – user1251007 Apr 18 '18 at 09:03
  • Well I don't have your csv file – jat Apr 18 '18 at 09:04
  • There is no csv file. I'll update the question to make it easier for you to test. 2 sec – Joe Apr 18 '18 at 09:09
  • @jat Ctrl+C the dataframe, then use `df = pd.read_clipboard()` – Georgy Apr 18 '18 at 09:10
  • @jpp thanks for feedback. df.apply with the lambda really helps. – Joe Apr 19 '18 at 08:12

3 Answers

You can use pd.DataFrame.apply for this:

df['Results'] = df.apply(lambda row: row.get(row['production_type']), axis=1)

Explanation

  • pd.DataFrame.apply with axis=1 calls a function once per row, looping implicitly and passing each row in as a Series indexed by column name.
  • The function can be an anonymous lambda.
  • Here the lambda reads the column name stored in production_type and uses row.get to return that row's value from the named column.
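
As a small worked example of the row-wise lookup, here is a sketch on the first four rows of the question's data. The helper name pick_value is introduced here for illustration only; it does the same job as the lambda. Note that Series.get returns None by default when the name is not a column, whereas row[...] would raise a KeyError.

```python
import pandas as pd

# First four rows of the question's data (type_d written as a float)
df = pd.DataFrame({
    "production_type": ["type_a", "type_b", "type_c", "type_d"],
    "type_a": [1.173783, 1.418876, 1.560452, 1.750531],
    "type_b": [0.714846, 0.864110, 0.950331, 1.066091],
    "type_c": [0.583621, 0.705485, 0.775878, 0.870388],
    "type_d": [1.0, 1.0, 1.0, 1.0],
})


def pick_value(row):
    # row is a Series whose index holds the column names, so the
    # string stored in production_type can be used as a lookup key.
    return row.get(row["production_type"])


# axis=1 makes apply call pick_value once per row
df["Results"] = df.apply(pick_value, axis=1)
print(df[["production_type", "Results"]])
```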
jpp

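You can try a plain loop:

result = []
# enumerate supplies the row position alongside the column name
for position, column in enumerate(df['production_type']):
    # read this row's value from the column named in production_type
    result.append(df[column].iloc[position])

# assign the list directly; it is matched to the rows by position
df['Results'] = result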
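
The same loop can also be written without tracking positions at all. A minimal sketch, assuming a hypothetical non-default index (10, 20, 30) to show that df.at looks each cell up by its row label and column name:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "production_type": ["type_b", "type_a", "type_c"],
        "type_a": [1.173783, 1.418876, 1.560452],
        "type_b": [0.714846, 0.864110, 0.950331],
        "type_c": [0.583621, 0.705485, 0.775878],
    },
    index=[10, 20, 30],  # hypothetical labels, not from the question
)

# Pair each row label with the column name stored in that row,
# then read the single cell with the label-based accessor df.at.
results = [
    df.at[label, column]
    for label, column in zip(df.index, df["production_type"])
]

# A plain list is assigned to the rows in order
df["Results"] = results
print(df)
```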
prabhakar

You can use the index's map method and pass it a lambda function:

df['Results'] = df.index.map(lambda index: df[df['production_type'][index]][index])
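
For larger frames, the per-row lambda can be swapped for a NumPy lookup. This is a different technique from the map call above, sketched under the assumption that every production_type value names one of the columns listed in value_cols (a name introduced here for illustration).

```python
import numpy as np
import pandas as pd

# First four rows of the question's data
df = pd.DataFrame({
    "production_type": ["type_a", "type_b", "type_c", "type_d"],
    "type_a": [1.173783, 1.418876, 1.560452, 1.750531],
    "type_b": [0.714846, 0.864110, 0.950331, 1.066091],
    "type_c": [0.583621, 0.705485, 0.775878, 0.870388],
    "type_d": [1, 1, 1, 1],
})

value_cols = ["type_a", "type_b", "type_c", "type_d"]

# Translate each row's column name into that column's position.
# get_indexer returns -1 for names that are not in value_cols.
col_pos = pd.Index(value_cols).get_indexer(df["production_type"])
if (col_pos == -1).any():
    raise KeyError("production_type contains a name that is not a value column")

# Pick one cell per row: row i, column col_pos[i]
values = df[value_cols].to_numpy()
df["Results"] = values[np.arange(len(df)), col_pos]
print(df[["production_type", "Results"]])
```

The explicit -1 check matters because NumPy would otherwise treat -1 as "last column" and silently return the wrong value.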
Mihai Alexandru-Ionut