1

I am using Python 2.7 to extract information from webpages (with BeautifulSoup), sort that information and insert it into a SQL table (with MySQLdb). That is quite simple, and I was able to find great reference from other places on SO.

My issue is I need code to search the table and if the information already exists, update it. But if the information does not exist, create a new record.

Example:

Table
+-----------+----------+--------+
|   Name    |   Phone  |  Date  |
+-----------+----------+--------+
|   John    | 344-7989 |   9/1  |
+-----------+----------+--------+
| Alexander | 198-3333 |  8/16  |
+-----------+----------+--------+

The web-crawler finds new information online and stores it into a list of dictionaries:

[ 
   {
    "Name" : "Samantha",
    "Phone" : "788-3443",
    "Date" : "9/5"
   }, {
    "Name" : "John",
    "Phone" : "222-9009",
    "Date" : "9/5"
   }
]

Now, one record (Samantha) is completely new, does not exist in the table. But the other record (John), already exist but his information is more recent. What is the code for editing the table with this information and creating:

Final Table
+-----------+----------+--------+
|   Name    |   Phone  |  Date  |
+-----------+----------+--------+
|   John    | 222-9009 |   9/5  |
+-----------+----------+--------+
| Alexander | 198-3333 |  8/16  |
+-----------+----------+--------+
| Samantha  | 788-3443 |   9/5  |
+-----------+----------+--------+
Jacob Bridges
  • 725
  • 2
  • 8
  • 16
  • What do you want, the MySQL code or the Python code? – Miquel Sep 05 '13 at 20:14
  • I believe the MySQL code. From what I understand, you just put the MySQL code into `(MySQLdb.connect).cursor()`. If someone could give me a working example with both SQL code **and** Python code, that would be great. – Jacob Bridges Sep 05 '13 at 20:17
  • But, can you have duplicated names on your list, for example, two Johns? If so, how do you identify the register to be updated? – Miquel Sep 05 '13 at 20:19
  • I thought someone might mention this, should have addressed it in the question. Remember that this is _sample_ data. In the real data if two entries are the same, they are duplicates. No two entries should be alike. (Sorted under a particular column, of course) – Jacob Bridges Sep 05 '13 at 20:23

1 Answers1

2

In this link there is a very good answer to a similar issue.

Adapting it to your context, this could be an example code:

CREATE TABLE Users (`name` varchar(35) unique, `phone` varchar(10), `date` date);

And an idea for the python code (using MySQLdb):

...
cursor = connection.cursor()
cursor.execute("""
    INSERT INTO Users (name, phone, date)
    VALUES (%s,%s,%s)
    ON DUPLICATE KEY UPDATE phone = %s, date=%s;
    """,(name,phone,date,phone,date))
cursor.close()
Community
  • 1
  • 1
Miquel
  • 858
  • 11
  • 20