
I am using Gatsby as a static site generator with Drupal as the backend. The data is retrieved with the gatsby-source-drupal plugin, which I have configured as follows:

module.exports = {
  //Excluded irrelevant configurations
  plugins: [
    {
      resolve: 'gatsby-source-drupal',
      options: {
        baseUrl: 'https://example.org',
        apiBase: 'jsonapi',
        concurrentFileRequests: 120,
        disallowedLinkTypes: [
          //Standard disallows
          'self',
          'describedby',
          //Erroneous resources
          'block--block',
          'field_storage_config--field_storage_config',
          'menu--menu'
        ],
      },
    },
  ],
};

The strange part is that I can retrieve all the data when I run the build on my host machine, but the same build fails inside a Docker container with this error:

ERROR #11321  PLUGIN
"gatsby-source-drupal" threw an error while running the sourceNodes lifecycle:
connect ETIMEDOUT <Ip Address>:443

The failure occurs on different backend collections from run to run, which rules out a single problematic collection. Retrieving the failing resource from inside the container with curl succeeds. I don't think a server-side limit is the problem either, since the data retrieval works on my host machine. I compared the memory usage of both Node installations (v12) using process.memoryUsage() and got similar results.

Could there be a difference between the Node process on the host machine and the one in the Docker container that causes this?

Awemo

1 Answer


I've previously run into sporadic timeouts in Node when opening many parallel connections. Since you've set concurrentFileRequests to a rather high value (120), this might be the cause.

I'd recommend first lowering concurrentFileRequests to around 10 and checking whether the problem goes away.

If it does and you still need higher parallelism (though I think you're unlikely to see much better performance with higher values), you can try raising the UV_THREADPOOL_SIZE environment variable and see whether that helps.
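A minimal sketch of one way to do that, assuming the build is started from a small Node wrapper script. libuv reads the variable when its thread pool is first initialised, so the safest route is to set it in the environment the build process starts with (in a shell, by prefixing the build command with UV_THREADPOOL_SIZE=64). The helper names and the value 64 below are purely illustrative:

```javascript
const { spawnSync } = require('child_process');

// Return a copy of the environment with a larger libuv thread pool.
// The size is an assumption to experiment with, not a recommendation.
function envWithThreadPool(size, baseEnv = process.env) {
  return { ...baseEnv, UV_THREADPOOL_SIZE: String(size) };
}

// Run a command with the enlarged pool and return its exit status.
function runWithThreadPool(command, args, size) {
  const result = spawnSync(command, args, {
    env: envWithThreadPool(size),
    stdio: 'inherit',
  });
  return result.status;
}

// Hypothetical invocation; uncomment to run the build through this wrapper.
// runWithThreadPool('npx', ['gatsby', 'build'], 64);
```

In a Docker setup, passing the variable to the container's environment should have the same effect as the wrapper.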

See also this issue: NodeJS request timeouts with concurrency 100

ehrencrona
  • Thanks for your answer, and sorry for taking so long to respond. I have since changed my work laptop (from a Lenovo with Ubuntu 20.04 to a MacBook Pro) and I am no longer experiencing the issue. I suspect it resulted from the way Docker was configured on the old device, but I have not been able to identify the difference. Do you happen to know how I can get the current value of `UV_THREADPOOL_SIZE`? I have not been able to find any instructions on how to do that. – Awemo Jul 08 '20 at 07:37
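
On reading the current value: as far as I know, Node does not expose the actual pool size directly, so the closest approximation is to read the environment variable and fall back to libuv's documented default of 4 when it is unset. A small sketch (the function name is made up for illustration):

```javascript
// libuv's documented default pool size when the variable is unset.
const DEFAULT_UV_THREADPOOL_SIZE = 4;

// Report the size the process was configured with, not a measured value.
function configuredThreadPoolSize(env = process.env) {
  const parsed = Number.parseInt(env.UV_THREADPOOL_SIZE, 10);
  return Number.isInteger(parsed) && parsed > 0
    ? parsed
    : DEFAULT_UV_THREADPOOL_SIZE;
}

console.log(`UV_THREADPOOL_SIZE: ${configuredThreadPoolSize()}`);
```

Running this on the host and inside the container shows whether the two environments were started with different settings.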