I need help joining two data frames built by scraping two sections of the same page.

One sample URL (of many) is http://www.mercadopublico.cl/Procurement/Modules/RFB/DetailsAcquisition.aspx?idlicitacion=4593-2-L122

import pandas as pd
import requests
from bs4 import BeautifulSoup

results = []   # one tuple per product row
results2 = []  # one tuple per header table

# popup_linkz holds the detail-page URLs; to test, use a single URL instead
for i in popup_linkz:
    url = i

    soup = BeautifulSoup(requests.get(i).content, "html.parser")
    licitation_number = soup.select_one("#lblNumLicitacion").text
    responsable = soup.select_one("#lblResponsable").text
    ficha = soup.select_one("#lblFicha2Reclamo").text
    nombre_licitacion = soup.select_one("#lblNombreLicitacion").text

    # First section: one row per product, so a page can add several rows
    for t in soup.select("#grvProducto .borde_tabla00"):
        categoria = t.select_one('[id$="lblCategoria"]').text
        candidad = t.select_one('[id$="lblCantidad"]').text
        descripction = t.select_one('[id$="lblDescripcion"]').text
        results.append((licitation_number, responsable, ficha, nombre_licitacion, categoria, candidad, descripction))

    # Second section: header fields of the page
    for z in soup.select("#Ficha1 .tabla_ficha_00"):
        monto = z.select_one('[id$="lblFicha1Tipo"]').text
        estado = z.select_one('[id$="lblFicha1TituloEstado"]').text
        # comuna = z.select_one('[id$="lblFicha2TituloComuna"]').text
        results2.append((monto, estado))
        print('results')
        print(f"{monto=}")

df1 = results
df2 = results2
df3 = pd.merge(results, results2)  # this line raises the TypeError below: both arguments are plain lists
df = pd.DataFrame(data=results[1:], columns=results[0])
df.to_excel('licitaciones1.xlsx', index=False, header=False)  # write to an Excel file

I am getting this error:

TypeError: Can only merge Series or DataFrame objects, a <class 'list'> was passed

I'm not sure why. I have been trying to solve it, without success so far.

If you can help me, I would be really glad.

`results` looks like this:

[image: results]

`results2` looks like this:

[image]

kcomarks
  • Your code is not reproducible. `results` and `results2` are lists, they should be DataFrames – mozway Mar 26 '22 at 06:43
  • Yeah, I know. I'm trying to turn them into DataFrames, with no luck: `df1 = results`, `df1 = df1.to_frame().reset_index()`, `df1 = pd.DataFrame(results, columns=['licitation_number', 'responsable', 'ficha', 'nombre_licitacion', 'categoria', 'candidad', 'descripction'])`, and likewise `df2 = results2`, `df2 = df2.to_frame().reset_index()`, `df2 = pd.DataFrame(results2, columns=['monto', 'estado'])` – kcomarks Mar 26 '22 at 06:51
  • Why don't you remove the unnecessary code in your question and simplify it to give a small dummy example of how the data looks like and what you expect to obtain? – mozway Mar 26 '22 at 06:52
  • OK, I just put the dummy example together and figured out that the two lists don't have the same length, because the first extraction sometimes returns more than one row per iteration. I need to figure out how to repeat the `select_one` values from soup for the iterations where the first loop (`for t`) runs more than once – kcomarks Mar 26 '22 at 06:58
  • Please read [this](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – mozway Mar 26 '22 at 08:17
  • Thanks, I found a solution that avoids merging the tables, but I will keep that in mind in the future, mozway. Thanks a lot for your proactive spirit. – kcomarks Mar 26 '22 at 16:54
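
The point raised in the comments (`pd.merge` needs DataFrames, and the two lists have different lengths) can be sketched as follows. This is only an illustration under assumptions: the sample rows are made up, and it assumes `licitation_number` is also stored in `results2` so that both tables share a key, which the code in the question does not do.

```python
import pandas as pd

# Hypothetical sample rows standing in for the scraped values.
results = [
    ("4593-2-L122", "Resp A", "Ficha A", "Name A", "Cat 1", "10", "Desc 1"),
    ("4593-2-L122", "Resp A", "Ficha A", "Name A", "Cat 2", "5", "Desc 2"),
]
# Assumption: the licitation number is saved next to monto and estado,
# so the two tables have a column to join on.
results2 = [("4593-2-L122", "Type X", "Published")]

# Build real DataFrames with explicit column names before merging.
df1 = pd.DataFrame(
    results,
    columns=["licitation_number", "responsable", "ficha", "nombre_licitacion",
             "categoria", "candidad", "descripction"],
)
df2 = pd.DataFrame(results2, columns=["licitation_number", "monto", "estado"])

# A left merge repeats monto/estado for every product row of the same page.
merged = df1.merge(df2, on="licitation_number", how="left")
print(merged)
```

With a shared key, the differing lengths stop being a problem, because the header values are repeated for each matching product row.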

1 Answer

I just had to extract the single-valued field earlier, in the first part of the loop. Sorry for the question; I will not delete it, since it may be helpful for someone.

    url = i

    soup = BeautifulSoup(requests.get(i).content, "html.parser")
    licitation_number = soup.select_one("#lblNumLicitacion").text
    responsable = soup.select_one("#lblResponsable").text
    ficha = soup.select_one("#lblFicha2Reclamo").text
    nombre_licitacion = soup.select_one("#lblNombreLicitacion").text
    monto = soup.select_one("#lblFicha1Tipo").text  # here is the answer
    #print(f"{licitation_number=}")
    #print(f"{responsable=}")
    #print(f"{ficha=}")
    #print(f"{nombre_licitacion=}")
    #print(f"#lblFicha1Tipo")
    #print("-" * 80)
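
A minimal sketch of how this could fit together end to end, so that no merge is needed. It runs on an inline HTML snippet instead of the live site; the snippet, the `parse_page` helper, and the reduced set of columns are assumptions made for illustration, and only the element ids are taken from the code above.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical, heavily simplified stand-in for one detail page.
SAMPLE_HTML = """
<span id="lblNumLicitacion">4593-2-L122</span>
<span id="lblFicha1Tipo">Type X</span>
<table id="grvProducto">
  <tr class="borde_tabla00"><td>
    <span id="grvProducto_ctl02_lblCategoria">Cat 1</span>
    <span id="grvProducto_ctl02_lblCantidad">10</span>
  </td></tr>
  <tr class="borde_tabla00"><td>
    <span id="grvProducto_ctl03_lblCategoria">Cat 2</span>
    <span id="grvProducto_ctl03_lblCantidad">5</span>
  </td></tr>
</table>
"""


def parse_page(html):
    """Return one tuple per product row, each carrying the page-level fields."""
    soup = BeautifulSoup(html, "html.parser")
    # Page-level values are read once, before the product loop.
    licitation_number = soup.select_one("#lblNumLicitacion").text
    monto = soup.select_one("#lblFicha1Tipo").text
    rows = []
    for t in soup.select("#grvProducto .borde_tabla00"):
        categoria = t.select_one('[id$="lblCategoria"]').text
        cantidad = t.select_one('[id$="lblCantidad"]').text
        rows.append((licitation_number, monto, categoria, cantidad))
    return rows


results = parse_page(SAMPLE_HTML)
df = pd.DataFrame(
    results, columns=["licitation_number", "monto", "categoria", "cantidad"]
)
print(df)
```

Because `monto` is read once per page and appended to every product tuple, everything ends up in a single list and a single DataFrame.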
kcomarks