3

After writing my load script, when I run it with pgloader it fails this way:

load database
     from mysql://xxx:yyy@127.0.0.1/zzz
     into pgsql://xxx:yyy@localhost/zzz

including only table names matching 'TABLE'

with data only,
     create no tables, preserve index names,
     batch rows = 1000,
     batch size = 500 MB,
     prefetch rows = 1000
     -- on error stop,

set work_mem to '2048 MB', maintenance_work_mem to '4096 MB';

-- before load do $$ drop schema if exists jobs cascade; $$;

I get the error message below, but it is intermittent: it doesn't always happen, and I'm not sure which parameters to use. I have plenty of RAM, and the records are about 50 KB each, so a batch of 1000 rows should only be around 50 MB.

2020-05-25T04:28:14.194000Z INFO Incomplete Foreign Key definition: constraint "fk_job_currency1" on table "jobs2_beta2.job" referencing table NIL
2020-05-25T04:28:14.194000Z INFO Incomplete Foreign Key definition: constraint "fk_job_job_category1" on table "jobs2_beta2.job" referencing table NIL
2020-05-25T04:28:14.194000Z INFO Incomplete Foreign Key definition: constraint "fk_job_organization1" on table "jobs2_beta2.job" referencing table NIL
2020-05-25T04:28:14.194000Z INFO Incomplete Foreign Key definition: constraint "fk_job_resource1" on table "jobs2_beta2.job" referencing table NIL
2020-05-25T04:28:14.194000Z INFO Incomplete Foreign Key definition: constraint "fk_job_user1" on table "jobs2_beta2.job" referencing table NIL
2020-05-25T04:28:14.198000Z SQL MySQL: sending query: -- params: db-name
--         table-type-name
--         only-tables
--         only-tables
--         including
--         filter-list-to-where-clause incuding
--         excluding
--         filter-list-to-where-clause excluding
  SELECT table_name, index_name, index_type,
         sum(non_unique),
         cast(GROUP_CONCAT(column_name order by seq_in_index) as char)
    FROM information_schema.statistics
   WHERE table_schema = 'jobs2_beta2'
         and (table_name = 'job')

GROUP BY table_name, index_name, index_type;
2020-05-25T04:28:14.225000Z INFO Processing source catalogs
2020-05-25T04:28:14.272000Z NOTICE Prepare PostgreSQL database.
2020-05-25T04:28:14.275000Z DEBUG CONNECTED TO #<PGLOADER.PGSQL:PGSQL-CONNECTION pgsql://ziprecruiter@localhost:5432/ziprecruiter {1006A2A683}>
2020-05-25T04:28:14.275000Z DEBUG SET client_encoding TO 'utf8'
2020-05-25T04:28:14.275000Z DEBUG SET work_mem TO '1024 MB'
2020-05-25T04:28:14.276000Z DEBUG SET maintenance_work_mem TO '4096 MB'
2020-05-25T04:28:14.276000Z DEBUG SET application_name TO 'pgloader'
2020-05-25T04:28:14.280000Z DEBUG BEGIN
2020-05-25T04:28:14.314000Z SQL DROP TABLE IF EXISTS jobs2_beta2.job CASCADE;
2020-05-25T04:28:15.316000Z SQL CREATE TABLE jobs2_beta2.job
(
  id                       bigserial not null,
  resource_id              bigint not null,
  url                      text not null,
  job_title                text,
  html_job_description     text,
  text_job_description     text,
  last_crawl_date          timestamptz,
  first_indexed_date       timestamptz,
  job_category_id          bigint not null,
  currency_id              bigint not null,
  salary_exact             double precision,
  salary_is_range          smallint,
  salary_range_start       double precision,
  salary_range_end         double precision,
  address                  text,
  source_program           varchar(255),
  organization             varchar(255),
  organization_count       bigint,
  expiration_date          timestamptz,
  is_sponsored             smallint,
  is_hidden                smallint,
  educational_requirements text,
  experience_requirements  text,
  destination              text,
  organization_id          bigint not null,
  user_id                  bigint not null,
  is_expired               smallint,
  salary_periodicity       varchar(255),
  json_schema              text,
  clean_job_description    text,
  estimated_job_category   varchar(255)
);
2020-05-25T04:28:15.326000Z SQL -- params: table-names
select n, n::regclass::oid
  from (values ('jobs2_beta2.job')) as t(n);

2020-05-25T04:28:15.392000Z NOTICE COPY jobs2_beta2.job
2020-05-25T04:28:15.392000Z DEBUG Reader started for jobs2_beta2.job
2020-05-25T04:28:15.405000Z DEBUG start jobs2_beta2.job   1400
2020-05-25T04:28:15.407000Z INFO COPY ON ERROR STOP
2020-05-25T04:28:15.408000Z DEBUG CONNECTED TO #<MYSQL-CONNECTION mysql://jobs_client2@127.0.0.1:3306/jobs2_beta2 {100A4E61B3}>
2020-05-25T04:28:15.408000Z SQL MySQL: sending query: SELECT `id`, `resource_id`, `url`, `job_title`, `html_job_description`, `text_job_description`, `last_crawl_date`, `first_indexed_date`, `job_category_id`, `currency_id`, `salary_exact`, `salary_is_range`, `salary_range_start`, `salary_range_end`, `address`, `source_program`, `organization`, `organization_count`, `expiration_date`, `is_sponsored`, `is_hidden`, `educational_requirements`, `experience_requirements`, `destination`, `organization_id`, `user_id`, `is_expired`, `salary_periodicity`, `json_schema`, `clean_job_description`, `estimated_job_category` FROM `job`
2020-05-25T04:28:15.416000Z DEBUG CONNECTED TO #<PGLOADER.PGSQL:PGSQL-CONNECTION pgsql://ziprecruiter@localhost:5432/ziprecruiter {100A8C11A3}>
2020-05-25T04:28:15.416000Z DEBUG SET client_encoding TO 'utf8'
2020-05-25T04:28:15.416000Z DEBUG SET work_mem TO '1024 MB'
2020-05-25T04:28:15.416000Z DEBUG SET maintenance_work_mem TO '4096 MB'
2020-05-25T04:28:15.416000Z DEBUG SET application_name TO 'pgloader'
2020-05-25T04:28:15.416000Z SQL SET search_path TO jobs2_beta2;
2020-05-25T04:28:15.417000Z INFO pgsql:copy-rows-from-queue[0]: jobs2_beta2.job (id resource_id url job_title
                                                html_job_description
                                                text_job_description
                                                last_crawl_date
                                                first_indexed_date
                                                job_category_id currency_id
                                                salary_exact salary_is_range
                                                salary_range_start
                                                salary_range_end address
                                                source_program organization
                                                organization_count
                                                expiration_date is_sponsored
                                                is_hidden
                                                educational_requirements
                                                experience_requirements
                                                destination organization_id
                                                user_id is_expired
                                                salary_periodicity json_schema
                                                clean_job_description
                                                estimated_job_category)

Gen  Boxed Unboxed   LgBox LgUnbox  Pin       Alloc     Waste        Trig      WP GCs Mem-age
 0       0       0       0       0    0           0         0    42949672       0   0  0.0000
 1     577   42719       0      36    5  1301487024 118415952   753961944       0   1  1.3187
 2    1776   83008      25      60   65  2579123472 201863920     2000000     668   0  0.8672
 3       0       0       0       0    0           0         0     2000000       0   0  0.0000
 4       0       0       0       0    0           0         0     2000000       0   0  0.0000
 5       0       0       0       0    0           0         0     2000000       0   0  0.0000
 6    1593    1278       0       0    0    90993120   3083808     2000000    1501   0  0.0000
 7       0       0       0       0    0           0         0     2000000       0   0  0.0000
           Total bytes allocated    =    3971603616
           Dynamic-space-size bytes =    4294967296
GC control variables:
   *GC-INHIBIT* = true
   *GC-PENDING* = true
   *STOP-FOR-GC-PENDING* = false
fatal error encountered in SBCL pid 9514(tid 0x7ffff492f700):
Heap exhausted, game over.

Welcome to LDB, a low-level debugger for the Lisp runtime environment.
ldb>


I tried fiddling with the memory parameters; can anybody provide assistance with this issue?

Felipe Valdes

3 Answers

2

Found a GitHub issue related to "Heap exhausted, game over" that might be helpful:

https://github.com/dimitri/pgloader/issues/327
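
The suggestions in that thread mostly come down to either rebuilding pgloader with a larger SBCL heap or telling pgloader to keep less data in memory at once. As a rough sketch of the second option (the numbers here are only illustrative guesses, not values verified against your data), the relevant knobs all live in the WITH clause of the load command:

load database
     from mysql://xxx:yyy@127.0.0.1/zzz
     into pgsql://xxx:yyy@localhost/zzz

with data only, create no tables, preserve index names,
     workers = 2, concurrency = 1,   -- fewer parallel readers/writers holding buffers
     batch rows = 200,               -- smaller COPY batches held in memory
     batch size = 20 MB,
     prefetch rows = 200;            -- fewer source rows queued ahead of the writers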

0

Just got here after hitting the same heap-exhausted problem. No matter what I do or set, it always runs for some datasets and then breaks with the heap exhaustion.

fatal error encountered in SBCL pid 53779 tid 53794:
Heap exhausted, game over.

Welcome to LDB, a low-level debugger for the Lisp runtime environment.

In the GitHub thread https://github.com/dimitri/pgloader/issues/327 I saw the following message, from Jan 22, 2016:

"I've managed to import my database using another tool and dumping tables one-by-one, looks like my DB was fine, there is some error in pgloader, I suppose."

I am supposing the same at the moment. There must be a major problem in pgloader; the comment had no info about which tool was finally able to load the data from MySQL to PostgreSQL.

Maybe this helps if someone is stuck on the same heap problems. There is no solution yet (for me).

svens
  • Here are other converters that might be helpful: https://wiki.postgresql.org/wiki/Converting_from_other_Databases_to_PostgreSQL – svens May 14 '23 at 08:12
  • I think at the time it was solved by changing the pgloader configuration file; the file is the key. There was nothing wrong with pgloader, we just had to change the conf file. Maybe share your config file. @svens, do keep in mind it may take a couple of tries – Felipe Valdes May 19 '23 at 10:43
0

In my case, this parameter fixed the error:

prefetch rows = 1000
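
For what it's worth, my understanding from the pgloader reference (treat this as a hedged note rather than a definitive explanation) is that prefetch rows caps how many source rows the reader keeps queued in memory ahead of the COPY writers, and its default is much larger than 1000, so with big rows a low prefetch keeps only a few tens of megabytes queued. It goes in the WITH clause of the load command, for example:

load database
     from mysql://xxx:yyy@127.0.0.1/zzz
     into pgsql://xxx:yyy@localhost/zzz

with prefetch rows = 1000;   -- caps how many source rows pgloader queues in memory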

Remy