0

I need to copy data from json file to Postgresql database. I have a json file that have 9000 users with information about them, that looks like this:

"name": "Kathryn", "time_created": 1665335716, "gender": "female", "age": 38, "last_name": "Smith", "ip": "192.168.0.110", "city": "NY", "premium": null,  "birth_day": "01.01", "balance": 55.43255500944704, "user_id": 8676}

I need to copy data from this file to Postgresql. How can I do this by sql or python. Postgresql database is in local docker compose container

jian
  • 4,119
  • 1
  • 17
  • 32
  • refer this url : https://stackoverflow.com/questions/39224382/how-can-i-import-a-json-file-into-postgresql – Learn Hadoop Nov 28 '22 at 15:49
  • that's not help me, sql had an error: – George Rybojchuk Nov 28 '22 at 16:00
  • Or through python read json file ( import json ) and insert into postgres. Hope it will should work – Learn Hadoop Nov 28 '22 at 16:07
  • try this script. create table sample_json(record json not null); insert into sample_json(record) values('{"name": "Kathryn", "time_created": 1665335716, "gender": "female", "age": 38, "last_name": "Smith", "ip": "192.168.0.110", "city": "NY", "premium": null, "birth_day": "01.01", "balance": 55.43255500944704, "user_id": 8676}') select * from sample_json; select record->'name',record->'time_created' from sample_json;. – Learn Hadoop Nov 28 '22 at 16:13
  • Create your main table and and insert into using select statement from sample_json table. for example insert into main_table(name,time_created) select record->'name',record->'time_created' from sample_json; – Learn Hadoop Nov 28 '22 at 16:15

1 Answers1

0

@George Rybojchuk

check complete sql :

drop table if exists sample_json;
drop table if exists target;
create table sample_json(record json not null);
create table target(name varchar(20),time_created varchar(20));
insert into sample_json(record) values('{"name": "Kathryn", "time_created": 1665335716, "gender": "female", "age": 38, "last_name": "Smith", "ip": "192.168.0.110", "city": "NY", "premium": null,  "birth_day": "01.01", "balance": 55.43255500944704, "user_id": 8676}');
select * from sample_json;
insert into target(name,time_created)
select record->'name',record->'time_created'
from sample_json;

python approach:

import json

if __name__ == '__main__':

    records = [
        '{"name": "Kathryn", "time_created": 1665335716, "gender": "female", "age": 38, "last_name": "Smith", "ip": "192.168.0.110", "city": "NY", "premium": null,  "birth_day": "01.01", "balance": 55.43255500944704, "user_id": 8676}',
        '{"name": "Kathryn1", "time_created": 1665335716, "gender": "female", "age": 38, "last_name": "Smith", "ip": "192.168.0.110", "city": "NY", "premium": null,  "birth_day": "01.01", "balance": 55.43255500944704, "user_id": 8676}',
        '{"name": "Kathryn2", "time_created": 1665335716, "gender": "male", "age": 38, "last_name": "Smith", "ip": "192.168.0.110", "city": "NY", "premium": null,  "birth_day": "01.01", "balance": 55.43255500944704, "user_id": 8676}',

    ]

    for record in records:
        json_data = json.loads(record)

        name = json_data['name']
        gender = json_data['gender']
        print(name, gender)
        # connect postgresql
        # insert into target table
Learn Hadoop
  • 2,760
  • 8
  • 28
  • 60
  • Sorry for stupid question, but I need to read all users from file. File has a 9000 users, how can I do that?, describe please – George Rybojchuk Nov 28 '22 at 16:39
  • updated python approach.. check now – Learn Hadoop Nov 28 '22 at 16:55
  • Revise the SQL approach. At fifth step (insert into sample_jason ...) instead use the [copy\(https://www.postgresql.org/docs/current/sql-copy.html) command to load the entire file. You them need only to expand the `insert ... select` column lists for each column. This will process all records from the file regardless of how many rows it contains. – Belayer Nov 28 '22 at 17:50
  • `Age` is a poor data point, it has to be updated regularly (daily). Better to store birth date as data type `date`. This can be derived in this case from the json values `age` and `birth_day`. Derive `age` as needed or create a view with it. – Belayer Nov 28 '22 at 18:00