
I have a script that reads a CSV file, takes a few columns from it, and writes a new file with that information. One of the columns contains URLs, but they are not complete (/test/one.aspx). I want to prepend https://www.example.com to every row in this column [4 image_link] so that the end result is a full URL (https://www.example.com/test/one.aspx).

Here is my code:

import xlsxwriter
import pandas as pd

ed = pd.read_csv('/home/google_files/google_file.csv')
df = pd.DataFrame(ed[['id', 'title', 'description', 'product_type', 'link', 'image_link', 'condition', 'availability','price', 'brand', 'gtin','mpn']])        
df.to_csv('/home/csv/google_files/connexity_file.csv', index=False)
    
toxl = pd.ExcelWriter('/home/csv/google_files/conn_file.xlsx') 
df.to_excel(toxl, index = False)   
toxl.save()      
robothead

1 Answer

import xlsxwriter
import pandas as pd

ed = pd.read_csv('/home/google_files/google_file.csv')
# Keep only the needed columns; .copy() gives an independent frame to modify
df = ed[['id', 'title', 'description', 'product_type', 'link', 'image_link', 'condition', 'availability', 'price', 'brand', 'gtin', 'mpn']].copy()
# Prepend the site root to every value in image_link
df['image_link'] = "https://www.example.com" + df['image_link'].astype(str)
df.to_csv('/home/csv/google_files/connexity_file.csv', index=False)

# The context manager writes and closes the workbook on exit
# (ExcelWriter.save() is no longer available in pandas 2.0 and later)
with pd.ExcelWriter('/home/csv/google_files/conn_file.xlsx') as toxl:
    df.to_excel(toxl, index=False)

Edit: removed the join, so the prefix is now concatenated once to the front of each value.
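If the column may contain empty cells or values that are already full URLs, a slightly more defensive variant could look like the sketch below. The sample rows, the `BASE_URL` constant, and the `add_base` helper are hypothetical names made up for illustration; the real data would come from the CSV in the question.

```python
import pandas as pd

BASE_URL = "https://www.example.com"  # assumed site root, taken from the question

# Hypothetical sample rows standing in for the CSV contents.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "image_link": ["/test/one.aspx", "https://www.example.com/test/two.aspx", None],
})

def add_base(path):
    """Prefix BASE_URL to a relative path; leave missing or absolute values unchanged."""
    if pd.isna(path):
        return path
    path = str(path)
    if path.startswith(("http://", "https://")):
        return path
    return BASE_URL + path

df["image_link"] = df["image_link"].map(add_base)
print(df["image_link"].tolist())
```

Plain concatenation on `astype(str)` would turn a missing cell into the text "nan" with the prefix in front of it, which is why this version checks for missing values first.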

Paul Brennan
    I tried what you added; this is what gets written to the new file: /https://www.example.com7https://www.example.com0https://www.example.com0https://www.example.com6https://www.example.com7https://www.example.com6https://www.example.com.https://www.example.comahttps://www.example.comshttps://www.example.comphttps://www.example.comx – robothead Nov 30 '20 at 17:42
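The output in this comment is consistent with `str.join` having been called on the path string, which is an assumption since the earlier revision of the answer is not shown here. `join` treats its argument as a sequence of characters and places the prefix between each pair of them. A minimal sketch with a made-up short path:

```python
prefix = "https://www.example.com"
path = "/ab"  # hypothetical short path, used only to keep the output readable

# str.join inserts the prefix between every character of path
joined = prefix.join(path)

# Plain concatenation puts the prefix in front once
concatenated = prefix + path

print(joined)
print(concatenated)
```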