1

I have to update the database with the CSV files. Consider the database table looks like this: database table

The CSV file data looks like this:

enter image description here

As you can see the CSV file data some data modified and some new records are added and what I supposed to do is to update only the data which is modified or some new records which are added.

In Table2 the first record of col2 is modified.. I need to update only the first record of col2(i.e, AA) but not the whole records of col2.

I could do this by hardcoding but I don't want to do it by hardcoding as I need to do this with 2000 tables.

Can anyone suggest me the steps to approach my goal.

Here is my code snippet..

df = pd.read_csv('F:\\filename.csv', sep=",", header=0, dtype=str)

sql_query2 = engine.execute('''
                               SELECT
                               *
                               FROM ttcmcs023111temp
                               ''')

df2 = pd.DataFrame(sql_query2)
df.update(df2)
User
  • 37
  • 1
  • 9

1 Answers1

0

Since I do not have data similar to you, I used my own DB. The schema of my books table is as follows:

+--------+-------------+------+-----+---------+-------+
| Field  | Type        | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+-------+
| id     | int(11)     | NO   | PRI | NULL    |       |
| name   | varchar(30) | NO   |     | NULL    |       |
| author | char(30)    | NO   |     | NULL    |       |
+--------+-------------+------+-----+---------+-------+

And the table looks like this:

+----+--------------------+------------------+
| id | name               | author           |
+----+--------------------+------------------+
|  1 | Origin             | Dan Brown        |
|  2 | River God          | Wilbur Smith     |
|  3 | Chromosome 6       | Robin Cook       |
|  4 | Where Eagles Dare  | Alistair Maclean |
|  5 | The Seventh Scroll | Dan Brown        |  ### Added wrong entry to prove 
+----+--------------------+------------------+  ### my point  

So, my approach is to create a new temporary table with the same schema as the books table from the CSV using python. The code I used is as follows:

sql_query = sqlalchemy.text("CREATE TABLE temp (id int primary key, name varchar(30) not null, author varchar(30) not null)")
result = db_connection.execute(sql_query)
csv_df.to_sql('temp', con = db_connection, index = False, if_exists = 'append')

Which creates a table like this:

+----+--------------------+------------------+
| id | name               | author           |
+----+--------------------+------------------+
|  1 | Origin             | Dan Brown        |
|  2 | River God          | Wilbur Smith     |
|  3 | Chromosome 6       | Robin Cook       |
|  4 | Where Eagles Dare  | Alistair Maclean |
|  5 | The Seventh Scroll | Wilbur Smith     |
+----+--------------------+------------------+

Now, you just need to use the update in MySQL using INNER JOIN to update the values you want to update in your original table. (in my case, 'books').

Here's how you'll do this:

statement = '''update books b
inner join temp t
on t.id = b.id
set b.name = t.name,
b.author = t.author;
'''
db_connection.execute(statement)

This query will update the values in table books from the table temp that I've created using the CSV.

You can destroy the temp table after updating the values.

Vishal A.
  • 1,373
  • 8
  • 19
  • Thanks @VishalA and the same thing i need to do with multiples tables like around 2000 tables simultaneously, if you had any solution or clue please share to me . – User Dec 13 '21 at 14:16
  • If every table has a similar schema, you can perform the actions inside a for loop and destroy the `temp` at the end of every iteration. – Vishal A. Dec 13 '21 at 14:39
  • statement = '''update books b inner join temp t on t.id = b.id set b.name = t.name, b.author = t.author; ''', here is where i feel complicated to put inside a loop as every table has its different column names – User Dec 13 '21 at 14:44
  • Create a list of column names and pass these values as variables to the SQL query. You can get some idea form here: https://stackoverflow.com/questions/902408/how-to-use-variables-in-sql-statement-in-python – Vishal A. Dec 13 '21 at 14:51
  • let me try as you said – User Dec 13 '21 at 15:06
  • As i am new to this programming, i am not getting how to do this, can you please help me out in this @VishalA. – User Dec 14 '21 at 04:06