
I generated data with the Dgraph Bulk Loader using --reduce_shards=2, following the documentation.

This generated two p directories, on which I ran Alphas following the linked instructions.

The commands I used: first, I ran dgraph zero --my=IPADDR:5080 in the top-level directory.

Then I changed into the out directory and ran one Alpha on /0/p using dgraph alpha --lru_mb=4096 --my=IPADDR:7080 --zero=localhost:5080

If I check Ratel at this point, everything looks good.

Then I changed into the out directory again and ran a second Alpha on /1/p using dgraph alpha --lru_mb=4096 --my=IPADDR:7081 --zero=localhost:5080 -o=1

This runs fine, but the data from the /1/p directory is not loaded, and Ratel starts showing an error in the schema.

Two other options I tried:

  1. I did the bulk load using --reduce_shards=1 and ran just one Alpha; everything works fine.

  2. I stopped the first Alpha and ran an Alpha on /1/p. The other predicates start showing up and it runs fine, but now the /0/p data is gone.

Stanislav Kralin
Aishwat Singh

1 Answer


There's a known bug with multi-group bulk loading where data that should be served by other Alpha groups does not appear in queries. This will be fixed in the v1.0.12 release, which will be available in the next day or so.

For now, a way you can do bulk-loading for multi-group clusters is to do the following:

  1. Run the Dgraph Bulk Loader for a single group (--reduce_shards=1), which outputs a single data directory at ./out/0/p.
  2. Start the first Alpha with the bulk-loaded data directory.
  3. Start the other Alphas, which join the cluster as members of different groups.
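As a rough sketch, the steps above might translate into the commands below. The script only prints the commands rather than running them, because Zero and each Alpha are long-running servers that need their own terminals. The input file names (data.rdf.gz, data.schema), the alpha1 working directory, and the bulk loader's -f and -s input flags are assumptions for illustration; some 1.0.x releases take RDF files via -r instead, so check dgraph bulk --help for your version. The remaining flags are the ones used in the question.

```shell
#!/bin/sh
# Placeholder address, as in the question; override via the environment.
IPADDR="${IPADDR:-IPADDR}"

# Print the workaround as a sequence of commands, one per line.
# Line 1: Zero, which must be up before the bulk loader runs.
# Line 2: single-group bulk load (hypothetical input files; -f/-s assumed).
# Line 3: first Alpha, started where the bulk-loaded ./p directory lives.
# Line 4: second Alpha, started in an empty (hypothetical) directory with
#         a port offset so it joins the cluster as a new group.
plan() {
  cat <<EOF
dgraph zero --my=${IPADDR}:5080
dgraph bulk -f data.rdf.gz -s data.schema --reduce_shards=1 --zero=localhost:5080
(cd out/0 && dgraph alpha --lru_mb=4096 --my=${IPADDR}:7080 --zero=localhost:5080)
(mkdir -p alpha1 && cd alpha1 && dgraph alpha --lru_mb=4096 --my=${IPADDR}:7081 --zero=localhost:5080 -o=1)
EOF
}

plan
```

Each printed line is meant to be pasted into its own terminal, in order, waiting for the bulk load to finish before starting the first Alpha.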

After 8 minutes (or after the duration set in --rebalance_interval), Dgraph Zero will rebalance the predicates among the different groups:

$ dgraph zero --help
...
      --rebalance_interval duration   Interval for trying a predicate move. (default 8m0s)
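If eight minutes is too long to wait for the first predicate move, the interval can presumably be shortened when Zero is started. The sketch below only builds and prints the command; the zero_cmd helper and the 2m value are illustrative assumptions, while the --rebalance_interval flag and its 8m0s default come from the help output above.

```shell
#!/bin/sh
# Hypothetical helper: print a Zero start command with a given rebalance
# interval, expressed as a Go-style duration such as 30s or 2m.
zero_cmd() {
  interval="${1:-8m0s}"   # falls back to the documented default
  echo "dgraph zero --my=${IPADDR:-IPADDR}:5080 --rebalance_interval=${interval}"
}

zero_cmd 2m
```

A shorter interval makes Zero attempt predicate moves sooner, which also means moves are more likely to overlap with ongoing writes.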
Daniel Mai
  • Yes, that's what I am doing for now, but it isn't a very good approach: for millions of records the first rebalancing is slow. Another thing: if I am doing insertions while the Alphas are being rebalanced, they just fail. – Aishwat Singh Feb 27 '19 at 09:20
  • You can disable rebalancing temporarily and enable it later on. https://github.com/dgraph-io/dgraph/pull/3065 – Daniel Mai Mar 01 '19 at 02:50