I have written the following code to create a dataframe and add new rows and columns based on certain conditions. Unfortunately, it takes a long time to execute.

Is there an alternative way to do this? Any input is highly appreciated.

dfCircuito=None
for index, row in dadosCircuito.iterrows():
    for mes in range(1,13):
        for nue in range(1,5):
            for origem in range (1,3):
                for suprimento in range (1,3):
                    for tipo in range (1,3):

                        df=pd.DataFrame(dadosCircuito.iloc[[index]])
                        df['MES']=mes
                        if(nue==1):
                            df['NUE']='N'
                        elif(nue==2):
                            df['NUE']='C'
                        elif(nue==3):
                            df['NUE']='F'
                        else:
                            df['NUE']='D'

                        if(origem==1):
                            df['Origem']='DISTRIBUICAO'
                        else:
                            df['Origem']='SUBTRANSMISSAO'


                        if(suprimento==1):
                            df['Suprimento']='INTERNO'
                        else:
                            df['Suprimento']='EXTERNO'

                        if(tipo==1):
                            df['TipoOcorrencia']='EMERGENCIAL'
                        else:
                            df['TipoOcorrencia']='PROGRAMADA'


                        dfCircuito=pd.concat([dfCircuito, df], axis=0)

Vandan Revanur
  • Welcome to Stack Overflow! Please have a look at [How to make good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and provide a sample of your input and your preferred output so that we can better understand your problem – G. Anderson Jan 29 '20 at 20:01
  • Without knowing the size of your dataframe, I would look first to reducing the nested `for` loops if possible to improve performance. – Matthew Barlowe Jan 29 '20 at 20:13
  • Please give input and output examples so we can better understand your question – Shubham Shaswat Jan 29 '20 at 20:14

1 Answer

If I understand you correctly, you are trying to add a number of rows for each row of dadosCircuito. The extra rows are all combinations of mes=1...12; nue=N,C,F,D; ...

You can create a dataframe containing every combination of the attributes (their Cartesian product), then join it back to dadosCircuito:

mes = range(1,13)
nues = list('NCFD')
origems = ['DISTRIBUICAO', 'SUBTRANSMISSAO']
suprimentos = ['INTERNO', 'EXTERNO']
tipos = ['EMERGENCIAL', 'PROGRAMADA']

# Make sure dadosCircuito.index is unique. If not, call a reset_index
# dadosCircuito = dadosCircuito.reset_index()
df = pd.MultiIndex.from_product([dadosCircuito.index, mes, nues, origems, suprimentos, tipos], names=['index', 'MES', 'NUE', 'Origem', 'Suprimento', 'TipoOcorrencia']) \
        .to_frame(index=False) \
        .set_index('index')

dfCircuito = dadosCircuito.join(df)
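
To check the result on a small input, here is a minimal sketch that runs the same approach end to end. The two-row `dadosCircuito` and its `Circuito` column are made up for illustration, since the question does not show the real data. The `merge(how='cross')` variant (available in pandas 1.2 and later) is an alternative I am adding, not part of the approach above; under these assumptions it should give the same rows.

```python
import pandas as pd

# Hypothetical sample input; the real dadosCircuito is not shown in the question.
dadosCircuito = pd.DataFrame({'Circuito': ['A', 'B']})

mes = range(1, 13)
nues = list('NCFD')
origems = ['DISTRIBUICAO', 'SUBTRANSMISSAO']
suprimentos = ['INTERNO', 'EXTERNO']
tipos = ['EMERGENCIAL', 'PROGRAMADA']
names = ['MES', 'NUE', 'Origem', 'Suprimento', 'TipoOcorrencia']

# Approach from above: a product that includes the index, then a join on it.
attrs = (
    pd.MultiIndex.from_product(
        [dadosCircuito.index, mes, nues, origems, suprimentos, tipos],
        names=['index'] + names,
    )
    .to_frame(index=False)
    .set_index('index')
)
dfCircuito = dadosCircuito.join(attrs)

# Alternative sketch: build the attribute combinations once, then cross-merge.
combos = pd.MultiIndex.from_product(
    [mes, nues, origems, suprimentos, tipos], names=names
).to_frame(index=False)
dfCross = dadosCircuito.merge(combos, how='cross')

# Each input row expands to 12 * 4 * 2 * 2 * 2 = 384 rows.
rows_per_circuit = len(combos)
print(len(dfCircuito), len(dfCross), rows_per_circuit * len(dadosCircuito))
```

Either way the frame is built in one vectorized step, which avoids calling `pd.concat` once per generated row as the loop in the question does.
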
Code Different