0

New to sql and postgres so sorry if this turns out to be a noobish question.

I have two tables:

CREATE TABLE    usertable (  
user_id VARCHAR PRIMARY KEY,
...);

CREATE TABLE hasFriend(
userID          VARCHAR,
friendID            VARCHAR ,
PRIMARY KEY (userID,friendID),
FOREIGN KEY (userID) REFERENCES usertable(user_id),
FOREIGN KEY (friendID) REFERENCES usertable(user_id));

Now I'm trying to run a python script to parse through a large json file that populates the usertable and friendtable. The parsing and insertion fun for user works correctly and closes the json file after finishing. After this function finishes, the friend function is called and the json is reopened from the top of the json file. Here is the code for the friend function:

def insert2FriendTable():
    with open('yelp_user.JSON','r') as f:    
        line = f.readline()
        count_line = 0

        try:
             #edited out the connect statement since it has personal info on it.
            conn = psycopg2.connect(...)
        except:
            print('Unable to connect to the database!')
        cur = conn.cursor()
        while line:
            data = json.loads(line)
            for friend in data['friends']:
                sql_str="INSERT INTO hasFriend (userID, friendID) VALUES ('" + data['user_id']  + "','" + friend + "');"
                try:
                    cur.execute(sql_str)
                except Exception as error:
                    print("Insert to hasFriend failed!", error)
                    return

When I run this it runs into the error:

Insert to hasFriend failed! duplicate key value violates unique constraint "hasfriend_pkey"
DETAIL:  Key (userid, friendid)=(om5ZiponkpRqUNa3pVPiRg, U_sn0B-HWdTSlHNXIl_4XA) already exists.

Here is an example row from the json file. This is the first row on the file so friendfunction is crashing on that first error by looking at the id's from the error

{"average_stars": 3.94, "compliment_cool": 1556, "compliment_cute": 211, "compliment_funny": 1556, "compliment_hot": 1285, "compliment_list": 101, "compliment_more": 134, "compliment_note": 1295, "compliment_photos": 162, "compliment_plain": 2134, "compliment_profile": 74, "compliment_writer": 402, "cool": 40110, "elite": [2014, 2017, 2011, 2012, 2015, 2009, 2013, 2007, 2016, 2006, 2010, 2008], "fans": 835, "friends": ["U_sn0B-HWdTSlHNXIl_4XA", "pnfVIB7UhvCQ7X2K0Q2XIw", "jVYzrVblDFSuL3GHtt8ZSA", "Z7bpqY89ZiBHXdo7UN1kiw", "8Aqr35f254lOeitNowt7ig", "zjcN27kCVeK8K2ONe9Qt4g", "8drMKNHWavs2g6uf0pLtvg", "_K2ViyfmVq6nzIitR0TIlg", "rUV1FUhji5xMjNBpcq5SXg", "yrGIgk5eaWy-eewLNv4KHQ", "3Vd_ATdvvuVVgn_YCpz8fw", "ebC_pH92K4uxyDenoXb5bg", "RJrGgtBXkpX2oEHM4hSqXg", "sdpIz4-s15T239CZ4Bd6Ag", "A0j21z2Q1HGic7jW6e9h7A", "AvC5XQAElcGAAn_Wr5auEg", "JlkHKBnHKdK8Tpls0AF5Aw", "8AG5MctcxTjP4svmUrt0yQ", "bKxdvn7KpmWjMzlmBvp-Xw", "VVMS74JyUk2h53yfC-xNsA", "-ro7OG3jjCSKnF6OJinKjg", "K7thO1n-vZ9PFYiC7nTR2w", "pRBzWnFzaCEtqhYyJ2ZTDQ", "nwESZ8e-KzXt2fKkOuRdIQ", "WNZfkL4DBspueoGSUOMAqA", "uU6fQWadr7Hx_MP0Vmy3kQ", "XiLxIJThWsE0x4d0IeSPsg", "Y9LBTbwO4g0BmdBIi0D3CA", "jGbj8fl575EIQJcfaA1FKQ", "nxWrhF_hyX0wwjrEkQX8uQ", "670k6Gr6V4VqLIKtVEmDuQ", "o5STsEtfvD1Ig0J7Z-1uxA", "x13yoEggBL0pIE7-KMnhDQ", "rCx7tb3toOJUsvdOeqYY0g", "nkN_do3fJ9xekchVC-v68A", "KzHRsFwryS7b5Fog8kkkGA", "_pBzBgtCTN9PNUPfgPDI8A", "PU5QaMADa6N_9ZoQ04ZjOw", "FkfpHzqoDRChwOYhA6NPnQ", "CaQy-zz10ajG7KkNSbXi5w", "brQ7OjB6f9nXWGk45A9A3g"], "funny": 10882, "name": "Andrea", "review_count": 2559, "useful": 83681, "user_id": "om5ZiponkpRqUNa3pVPiRg", "yelping_since": "2006-01-18"}

I have no earthly idea how to fix this. Any advice is helpful. Thanks in advance!

Ely Meyers
  • 11
  • 5
  • 1
    It's your code, so what *do* you want your code to do if it reads a line that says "'A' has friend 'B'" but there's already a `hasFriend` line in the db for that combination? Setting that primary key and using `INSERT` tells PostgreSQL that you want an error to be thrown if that happens. You're complaining about the error, so you must want something different to happen? – Dan Getz Jul 26 '22 at 00:11
  • Relatedly, right before you run `insert2FriendTable()`, is the friend table empty or not, and does that match your intentions? Does the JSON file have any duplicate rows? – Dan Getz Jul 26 '22 at 00:16
  • You created the unique constraint because of some requirement. When your data violates this requirement, what should the application do? You are the only person who knows this answer. – Frank Heikens Jul 26 '22 at 00:21
  • I'm going to bet there is more then one row in JSON file with `"user_id": "om5ZiponkpRqUNa3pVPiRg"` and the error is in the second row. Easy enough to find out by adding `print(data['user_id'])` right after `data = json.loads(line)`. – Adrian Klaver Jul 26 '22 at 00:22
  • Another consideration you need. From above comment you have a line "'A' has friend 'B'" then at a later point (in same or future file) you have a line "'B' has friend 'A'". Now the question to you: Do these 2 lines indicate the same relationship? Do you want to store both lines or just one? – Belayer Jul 26 '22 at 00:24
  • 1
    @DanGetz I want the hasfriend table to take the user_id and the user_id of the friend (for each friend in the "friends" list), and store them in the hasfriend table with respect to the usertable (using the foreign keys). If it already sees friend A and friend B in the hasfriend table, disregard it. My initial assumption was that using both the user_id for the user and the friend to make the composite key for hasfriend table would give me distinct rows in hasfriend. – Ely Meyers Jul 26 '22 at 00:28
  • @FrankHeikens & Dan Getz Yes, hasfriend is empty before I run it. No there aren't any duplicates in the json for the two user_id's. They exist in multiple friends lists, but they each only have one user profile. Here's screenshots of me searching through the json: https://imgur.com/a/gMWl6Uo – Ely Meyers Jul 26 '22 at 00:31
  • @Belayer I only need friend A has friend B and don't need Friend B has Friend A because its redundant. I only need it in one direction – Ely Meyers Jul 26 '22 at 00:36
  • Does this answer your question? [How to UPSERT (MERGE, INSERT ... ON DUPLICATE UPDATE) in PostgreSQL?](https://stackoverflow.com/questions/17267417/how-to-upsert-merge-insert-on-duplicate-update-in-postgresql) – Dan Getz Jul 26 '22 at 00:36
  • @DanGetz Thanks but no it doesn't. I want one person to have multiple friends, and show the relation between one person and each friend. – Ely Meyers Jul 26 '22 at 00:41
  • @ElyMeyers I think you already handled that in your code with your primary key. – Dan Getz Jul 26 '22 at 02:39
  • OK, I found a question and answer that are more tailored to what you're looking to do: https://stackoverflow.com/q/1009584/3004881 – Dan Getz Jul 27 '22 at 00:28
  • `ON CONFLICT DO NOTHING` as mentioned in both those questions' answers – Dan Getz Jul 27 '22 at 00:34

0 Answers0