
I have a Python program that produces around 200 CSV files with 25 records each. I want to merge these 200 files into one CSV file and load it into SQL Server. (I am assuming this is a good way to load the data.)

My final aim is to have one CSV file with all the data from the 200 CSV files, and to load that data into SQL Server as well.

All the files have the same columns. One of the columns contains the ISBN-13 book number. When I merge the files with the following code, the ISBN-13 number gets converted into scientific notation (9780981454221 becomes 9.78098145422e+12) and I lose information, such as the last digit. Is there any way to avoid this? Here is my code and sample data:

import pandas as pd
import os
import csv
import glob

os.chdir("//network/My Folder/")
df=pd.DataFrame()
for files in glob.glob("*.csv"):
    print(files)
    df = pd.concat([df,pd.read_csv(files)],axis=0)
df.to_csv("test.csv", sep=',', encoding='utf-8',index=False)

Data in the CSV files:

Book    ISBN-13
Book_1  9780262527132
Book_2  9780071495844
Book_3  9780679734031
Book_4  9781621840862
Book_5  9781614271352

I am new to Python and databases. Any suggestions would be appreciated. Thank you in advance!

Neil S

1 Answer


Use `dtype=str` for the ISBN-13 column, so that pandas keeps it as text instead of parsing it as a number:

for files in glob.glob("*.csv"):
    print(files)
    df = pd.concat([df,pd.read_csv(files, dtype={'ISBN-13':str})],axis=0)
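
For reference, here is a minimal self-contained sketch of the same idea. It is not the asker's exact setup: the helper name `merge_csv_files`, the temporary folders, and the two small demo files are assumptions made only so the example can run on its own. It also collects the frames in a list and calls `pd.concat` once, and it writes the merged file outside the input folder so that a later run does not pick it up as input.

```python
import glob
import os
import tempfile

import pandas as pd


def merge_csv_files(folder, output_path):
    """Concatenate every CSV in `folder`, reading ISBN-13 as text."""
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))
    # dtype={"ISBN-13": str} stops pandas from parsing the column as a number.
    frames = [pd.read_csv(path, dtype={"ISBN-13": str}) for path in paths]
    merged = pd.concat(frames, ignore_index=True)
    merged.to_csv(output_path, index=False, encoding="utf-8")
    return merged


# Hypothetical demo data: two small comma-separated files in a temp folder.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "part_1.csv"), "w") as f:
    f.write("Book,ISBN-13\nBook_1,9780262527132\nBook_2,9780071495844\n")
with open(os.path.join(demo_dir, "part_2.csv"), "w") as f:
    f.write("Book,ISBN-13\nBook_3,9780679734031\n")

# Write the result to a different folder than the inputs.
output_path = os.path.join(tempfile.mkdtemp(), "merged.csv")
merged = merge_csv_files(demo_dir, output_path)
print(merged)
```

Because the column is read as strings, the digits are written back to the merged file exactly as they appeared in the inputs.
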
MaxU - stand with Ukraine
  • Now to load the df into SQL. Trying it for the first time and I know I am going to run into tons of problems. Any suggestions that might help me avoid any common problems? – Neil S Nov 20 '17 at 20:50
  • " to load the df into SQL" - `df.to_sql(...)` - the easiest and by far the most effective way to do that... – MaxU - stand with Ukraine Nov 20 '17 at 20:57
  • I had a 10 min timer before I could accept the answer. It also just records the "Answer useful" feedback but doesn't show my response, as my reputation score is less than 15. I will try to see how I can load it to the SQL server with my login credentials and be able to update it periodically. Thank you so much for your help!! – Neil S Nov 20 '17 at 21:03
  • @NeilS, sure! You will find tons of examples of how to use `df.to_sql()` in conjunction with SQLAlchemy on StackOverflow ([example](https://stackoverflow.com/a/45670669/5741205)) - good luck! If you have any issues and can't find an answer on SO, just open a new question – MaxU - stand with Ukraine Nov 20 '17 at 21:06
  • I did run into a problem; [here is the question](https://stackoverflow.com/questions/47402225/python-sqlalchemy-trying-to-write-pandas-dataframe-to-sql-server-using-to-sql) – Neil S Nov 20 '17 at 22:47
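
As a rough illustration of the `df.to_sql(...)` suggestion in the comments, the sketch below writes a small frame to a database table and reads it back. It uses an in-memory SQLite database purely as a stand-in so the example runs without a server; the table name `books` and the sample frame are assumptions. For SQL Server, one would instead pass a SQLAlchemy engine (created with `sqlalchemy.create_engine` and a connection string for that server) as the second argument.

```python
import sqlite3

import pandas as pd

# Illustrative frame; in practice this would be the merged DataFrame.
books = pd.DataFrame(
    {
        "Book": ["Book_1", "Book_2"],
        "ISBN-13": ["9780262527132", "9780071495844"],
    }
)

# Stand-in database (assumption): in-memory SQLite instead of SQL Server.
conn = sqlite3.connect(":memory:")

# if_exists="replace" recreates the table on each run;
# if_exists="append" would add rows to an existing table instead.
books.to_sql("books", conn, if_exists="replace", index=False)

# Read the table back to check that the ISBN values survived as text.
round_trip = pd.read_sql("SELECT * FROM books", conn)
print(round_trip)

conn.close()
```

For periodic updates, `if_exists="append"` is probably the closer fit, although it does not by itself guard against loading the same rows twice.
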