
I'm using Gatsby to build a very large site (5k+ pages, 300k+ images). The source data is unreliable (e.g. fields are often missing), which leads to errors during the createPage process.

The issue is that if a single createPage call throws an error, the entire build fails. So sometimes 5k pages build successfully, and then the whole build crashes because of one error.

I tried wrapping the page creation in a try...catch but it made no difference:

      try {
        createPage({
          path: node.slug,
          component: path.resolve(`./src/templates/BlogPost.js`),
          context: {
            id: node.id,
          },
        });
      } catch (error) {
        console.log(error);
      }

(I also tried checking the data at the component level and returning null if it's incomplete, but createPage still creates a (blank) page, which I don't want: I just want the page to be skipped if the data is bad.)

So my question is: how can one handle errors / failed page creation during the build process so that failed pages are just skipped instead of crashing the whole build?

NB: this is almost a duplicate of this question, but the solution there doesn't work for me: I can't render an error page when the data is bad; I need the page to be skipped entirely, if that is at all possible.

rubie
  • You could validate the data before passing it to createPage? – Albert Skibinski Apr 21 '20 at 10:13
  • Yeah this is ultimately the solution, isn't it - I was really trying to avoid that because our build times are already crazy and validating deep json on every product is just going to add to that. But if there's really no other way that's what I'd have to do :( – rubie Apr 21 '20 at 11:28
  • 300k images...Woooow! – mkEagles Apr 21 '20 at 11:57

1 Answer


You should explicitly define the GraphQL schema for your source data: https://www.gatsbyjs.org/docs/schema-customization/#creating-type-definitions

With an explicit type definition, a query for a field that is missing on a node returns null instead of failing with an error. You can then check for these falsy values and skip the createPage call.

For example in this theme of mine I explicitly define the GraphQL schema for the Page type: https://github.com/LekoArts/gatsby-themes/blob/567858957ca484aef8114a7d0b8e4f14df0c8a00/themes/gatsby-theme-minimal-blog-core/gatsby-node.js#L90-L96

https://github.com/LekoArts/gatsby-themes/blob/567858957ca484aef8114a7d0b8e4f14df0c8a00/themes/gatsby-theme-minimal-blog-core/gatsby-node.js#L111-116

And if the end user didn't create any pages (by creating files inside content/pages), the query will return null and I can check against that: https://github.com/LekoArts/gatsby-themes/blob/567858957ca484aef8114a7d0b8e4f14df0c8a00/themes/gatsby-theme-minimal-blog-core/gatsby-node.js#L337-L347

LekoArts
  • Absolutely bang on, just what I need - thanks for sharing the repo as well, so useful to see how this works in practice – rubie Apr 21 '20 at 13:19