
How can I generate the same data table in Python as simply as the R code below does?

Genotype <- c(rep(c("CV1","CV2", "CV3"), each=9))

Treatment <- c(rep(c("TR1", "TR2", "TR3"), each=3), 
               rep(c("TR1", "TR2", "TR3"), each=3),
               rep(c("TR1", "TR2", "TR3"), each=3))
           
Block <- c(rep(c("B1","B2","B3"), times=9))

yield <- c(rep("15",5), rep("18",5), rep("20",8), rep("14",7), rep ("21",2))

dataA<- data.frame (Genotype, Treatment, Block, yield)
dataA


This is the Python code I wrote, but I believe there is a simpler way. Could you show me how to write simpler code, using something like rep() in R?

from pandas import DataFrame
        
source={'Genotype':["CV1","CV1","CV1","CV1","CV1","CV1","CV1","CV1","CV1","CV2","CV2",
                    "CV2","CV2","CV2","CV2","CV2","CV2","CV2","CV3","CV3","CV3","CV3",
                     "CV3","CV3", "CV3","CV3","CV3"],
'Treatment':["TR1","TR1","TR1","TR2","TR2","TR2","TR3","TR3","TR3","TR1","TR1","TR1","TR2","TR2",
             "TR2","TR3","TR3","TR3","TR1","TR1","TR1","TR2","TR2","TR2","TR3","TR3","TR3"],
'Block':["B1","B2","B3","B1","B2","B3","B1","B2","B3","B1","B2","B3","B1","B2","B3",
         "B1","B2","B3","B1","B2","B3","B1","B2","B3","B1","B2","B3"],
'Yield':[15,15,15,15,15,18,18,18,18,18,20,20,20,20,20,20,20,20,14,14,14,14,14,14,14,21,21]}
        
DataA=DataFrame(source) 
    
DataA

Many thanks!

Jin.w.Kim
  • try, ``["CV1"] * 9 + ["CV2"] * 9...`` & so on.. – sushanth Apr 21 '21 at 17:49
  • I am downvoting this question because there's no indication of what the OP has already tried or where they're stuck. Googling "repeat values python" points the way to several ways to do this, including this duplicate question: https://stackoverflow.com/questions/3459098/create-list-of-single-item-repeated-n-times – Kevin Troy Apr 21 '21 at 17:52

2 Answers

import pandas as pd

df = pd.DataFrame(
    {
        "Genotype": ["CV1"] * 9 + ["CV2"] * 9 + ["CV3"] * 9,
        "Treatment": (["TR1"] * 3 + ["TR2"] * 3 + ["TR3"] * 3) * 3,
        "Block": ["B1", "B2", "B3"] * 9,
        "yield": [15] * 5 + [18] * 5 + [20] * 8 + [14] * 7 + [21] * 2,
    }
)
print(df)

Prints:

   Genotype Treatment Block  yield
0       CV1       TR1    B1     15
1       CV1       TR1    B2     15
2       CV1       TR1    B3     15
3       CV1       TR2    B1     15
4       CV1       TR2    B2     15
5       CV1       TR2    B3     18
6       CV1       TR3    B1     18
7       CV1       TR3    B2     18
8       CV1       TR3    B3     18
9       CV2       TR1    B1     18
10      CV2       TR1    B2     20
11      CV2       TR1    B3     20
12      CV2       TR2    B1     20
13      CV2       TR2    B2     20
14      CV2       TR2    B3     20
15      CV2       TR3    B1     20
16      CV2       TR3    B2     20
17      CV2       TR3    B3     20
18      CV3       TR1    B1     14
19      CV3       TR1    B2     14
20      CV3       TR1    B3     14
21      CV3       TR2    B1     14
22      CV3       TR2    B2     14
23      CV3       TR2    B3     14
24      CV3       TR3    B1     14
25      CV3       TR3    B2     21
26      CV3       TR3    B3     21
Andrej Kesely
    Many thanks!! I just learned Python and thanks to you, I totally understood about this. Thanks again!!! – Jin.w.Kim Apr 21 '21 at 22:32
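
For readers looking for something closer to R's `rep()`, NumPy offers `np.repeat` (roughly `each=`) and `np.tile` (roughly `times=`). The following is only a sketch of that alternative, not part of the answer above; it assumes NumPy is installed, and the variable names (`genotype`, `yield_`, `df_np`) are illustrative.

```python
import numpy as np
import pandas as pd

# np.repeat repeats each element in place, similar to R's rep(x, each=n).
genotype = np.repeat(["CV1", "CV2", "CV3"], 9)

# np.tile repeats the whole sequence, similar to R's rep(x, times=n).
treatment = np.tile(np.repeat(["TR1", "TR2", "TR3"], 3), 3)
block = np.tile(["B1", "B2", "B3"], 9)

# np.repeat also accepts one repeat count per element.
yield_ = np.repeat([15, 18, 20, 14, 21], [5, 5, 8, 7, 2])

df_np = pd.DataFrame(
    {
        "Genotype": genotype,
        "Treatment": treatment,
        "Block": block,
        "yield": yield_,
    }
)
print(df_np.head())
```

The per-element counts passed to `np.repeat` for `yield_` replace the chain of `[15] * 5 + [18] * 5 + ...` terms, which may be easier to read when there are many distinct values.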

Thank you for the answers. I solved it and am sharing the code. I just started learning Python, so my code is very basic.

from pandas import DataFrame

source = {"Genotype": ["CV1"]*9 + ["CV2"]*9 + ["CV3"]*9,
        "Treatment": (["TR1"]*3 + ["TR2"]*3 + ["TR3"]*3)*3,
        "Block": ["B1","B2","B3"]*9,
        "yield": [15]* 5 + [18]* 5 + [20]* 8 + [14]* 7 + [21]* 2}

DataA=DataFrame(source)
DataA
Jin.w.Kim
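
If the goal is to keep the R-style call shape, list multiplication can be wrapped in a small helper. The `rep` function below is a hypothetical helper written for illustration (it is not a pandas or standard-library function) and only covers the `times` and `each` arguments, applying `each` first and then `times`, as R does.

```python
from itertools import chain, repeat


def rep(values, times=1, each=1):
    """Hypothetical helper loosely mimicking R's rep(x, times=, each=)."""
    # Repeat every element `each` times in place, then repeat the whole list `times` times.
    expanded = list(chain.from_iterable(repeat(v, each) for v in values))
    return expanded * times


genotype = rep(["CV1", "CV2", "CV3"], each=9)
treatment = rep(["TR1", "TR2", "TR3"], each=3, times=3)
block = rep(["B1", "B2", "B3"], times=9)
yield_ = rep([15], 5) + rep([18], 5) + rep([20], 8) + rep([14], 7) + rep([21], 2)

print(len(genotype), len(treatment), len(block), len(yield_))
```

The four lists can then be passed to `DataFrame` in a dict exactly as in the code above.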