
Update&Answer:

My misunderstanding was:

All the imported/required files will be transformed by the loader.

However, some imported/required files do not need to be transformed. For example, JS files in "node_modules" have usually already been processed, so there is no need to transform them again with the Babel loader. That is basically why we need "exclude: /node_modules/" in the loader.

Similarly, if you know which files should be transformed by a loader, you can use "include".

Simply put, the bundle built from entry.js will include all the imported/required files. But among these files, only some need to be transformed. That is why the loader configuration provides "include" and "exclude".


I am still not quite clear about why we need to use "include" or "exclude" in a webpack loader.

The entry js file always needs to include its imported/required js files recursively, and all the imported/required files will be transformed by the loader. If that is the case, why do we need "include" or "exclude" in the loader?

One common case is "exclude: /node_modules/". What confuses me is this: if the entry js file needs some files from node_modules and we exclude node_modules, then the final bundle file will not contain the required files from node_modules. In that case, the final bundle.js will not work correctly. Am I missing anything here?

const path = require('path'); // needed for path.join below

module.exports = {
  entry: [
    './index.js'
  ],
  output: {
    path: path.join(__dirname, 'public'),
    filename: 'bundle.js'
  },
  module: {
    loaders: [{
      test: /\.js$/,
      loader: 'babel',
      exclude: /node_modules/,
      query: {
        presets: ['es2015']
      }
    }]
  }
};

Thanks

Derek

derek

3 Answers


The problem is that without that exclude (or an include), webpack would run the loader on every dependency your code imports as it traverses them. Even though that could work, it would come with a heavy performance penalty.

I prefer to set up an include myself (allowlist over denylist/blocklist) as that gives me more control over the behavior. I include my application directory and then add more items to the include based on the need. This allows me to make exceptions easily and process bits from node_modules if absolutely needed.

Patrick
Juho Vepsäläinen
  • 1
    Then my question is if we exclude "node_modules", how can it work? Some functions from "node_modules" in the generated "bundle.js" will be missing. Then "bundle.js" cannot work properly. Correct? – derek Jun 15 '16 at 16:29
  • 2
    The assumption is that npm packages should be in such a form by default that they don't need any processing. There are rare exceptions but most should be useable out of the box. – Juho Vepsäläinen Jun 15 '16 at 17:29
  • 3
    I don't understand your point. Let me make my question more explicit: suppose our entry.js requires "react". Since the generated "bundle.js" is a self-contained file, it should include the "react"-related js files from "node_modules", correct? If that is the case and we exclude "node_modules", the "react"-related js files won't be included in "bundle.js". Then "bundle.js" cannot work because the "react" functions are not there. – derek Jun 15 '16 at 18:17
  • 20
    It will include React related files. It just won't process them through Babel. That's explicitly what we want to avoid here. That's what `include`/`exclude` say. React will still get bundled. We just don't pass it through Babel as it wouldn't make sense. – Juho Vepsäläinen Jun 15 '16 at 19:55
  • 1
    +1, this is super helpful. Could you show an example of how you use `include` to pull in some `node_modules` to be processed in addition to your project? I need to do just that, and I think I have it right, but it would be great to get some confirmation. – Dominic P Oct 25 '18 at 19:29
  • @Dominic, It accepts an array of paths. You can also use a function for more complex logic. Better open a separate question and link me to it. – Juho Vepsäläinen Oct 26 '18 at 18:00
  • @JuhoVepsäläinen good idea. I posted a [question here](https://stackoverflow.com/q/53016083/931860). Thanks for your help. – Dominic P Oct 26 '18 at 20:42

The answers so far haven't really answered your core question. You want to know how your bundled app still manages to function even though its dependencies have been 'excluded'.

Actually, those 'include' and 'exclude' properties are telling the loaders whether to include/exclude the files described (such as the contents of node_modules), not webpack itself.

So the 'excluded' modules you import from node_modules will be bundled - but they won't be transformed by babel. This is usually the desired behaviour: most libraries are already transpiled down to ES 5.1 (or even ES 3), and so wasting CPU cycles parsing them with babel (for instance) is pointless at best. At worst, as in the case of large single-file libraries like jQuery, it can throw an error and bring your build to a crashing halt. So we exclude node_modules.
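
To make the distinction concrete, here is a toy model (not webpack's actual implementation) of what happens to each module. The `modules` list, the `bundle` function, and the `fakeBabel` transform are all made up for illustration; the only point is that `exclude` gates the transform step, not the bundling step.

```javascript
// Toy model only: webpack's real pipeline is far more involved.
const exclude = /node_modules/;

// Stand-in for a loader such as Babel (hypothetical transform).
const fakeBabel = (source) => `/* transformed */ ${source}`;

// Every module reachable from the entry point ends up in this list.
const modules = [
  { path: '/app/index.js', source: 'const x = () => 1;' },
  { path: '/app/node_modules/react/index.js', source: 'var React = {};' }
];

function bundle(mods) {
  return mods.map((m) => ({
    path: m.path,
    // Excluded files skip the loader but are still emitted into the bundle.
    source: exclude.test(m.path) ? m.source : fakeBabel(m.source)
  }));
}

const output = bundle(modules);
console.log(output);
```

Both modules appear in `output`; only the application file has been passed through the stand-in transform.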

daemone
  • 4
    How do I tell webpack to exclude files, then? – Shailesh Vaishampayan Jan 25 '18 at 12:52
  • 7
    @ShaileshVaishampayan I think [externals](https://webpack.js.org/configuration/externals/) are what you're looking for. Webpack will not bundle the packages you specify as external, and instead at runtime your app/library will expect them to be available in the environment. – daemone May 09 '18 at 08:24

Why do you need to customize webpack include/exclude settings at all if webpack bundles (or externalizes) all dependencies anyway?

This seems to be the OP's main question. The gist of my answer is similar to previous answers: because of bundler performance. Everything that is required/imported will be bundled or externalized. exclude does not change that, but only excludes files from transformation according to module.rules. You generally do not want to transform all bundled dependencies (e.g. node_modules), since those are usually already in a "very digestible" format for your application, thus not requiring an extra transformation pass. In short: if possible try to avoid transformation, or: "exclude good, include bad".

However, while this type of performance optimization aims to reduce bundling time, it is not a perfect solution. In discussions about run-time performance optimization (will link when I find them again), you will find that it can be advantageous to transform (i.e. include) everything if the loader helps with global (cross-module) run-time performance optimization.

Another example of where it might be advantageous to include an already transformed library: imagine some library was transformed to replace async functions with a dependency on the dreaded regeneratorRuntime, which, in addition to slowing things down at runtime, is notorious for causing a lot of pain. In that case, if you don't want that dependency, you might (with some effort) be able to get webpack to include the library's raw source files and re-compile them with your own webpack config, keeping the async functions while still excluding most other node_modules.
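
One way to express "exclude most of node_modules, but still transform one package" is a negative lookahead in the `exclude` regex. This is a sketch under assumptions: `some-async-lib` is a hypothetical package name, and the package must actually ship untransformed source for re-compiling to have any effect.

```javascript
// Excludes everything under node_modules EXCEPT the hypothetical package
// `some-async-lib`. `[\\/]` matches both POSIX and Windows separators.
const exclude = /node_modules[\\/](?!some-async-lib[\\/])/;

const rule = {
  test: /\.js$/,
  exclude,
  loader: 'babel-loader' // assumes babel-loader is installed
};

// Quick check of which paths the loader would skip:
const samples = [
  '/app/src/index.js',
  '/app/node_modules/react/index.js',
  '/app/node_modules/some-async-lib/src/main.js'
];
for (const p of samples) {
  console.log(p, exclude.test(p) ? 'skipped' : 'transformed');
}
```

Note that packages nested inside `some-async-lib/node_modules` would still match the pattern and be skipped.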

How does "include" and "exclude" work in webpack loader?

The phrasing of the question title might lead some (such as myself) to come here from Google in order to better understand how to customize their webpack.config's include and exclude options since relevant official documentation is somewhat scattered. In short:

  1. exclude and include (and test and resource) are documented here.
  2. They are Conditions, which are documented here.
    • As you can see from that documentation, Conditions (i.e. include/exclude) can be functions, arrays of functions, or a mixture of functions, regexes, strings, etc.
    • I found the function bit particularly interesting. You can use that not only to better customize your conditions, but also to debug webpack resolution problems. It allows you to log and clearly understand all included/excluded files, as they are picked up by webpack's resolution algorithm. You can use functions like so:
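      // Assumes `const path = require('path')` and a `context` variable
      // holding the project root are defined earlier in the config file.
      {
        test: /\.js$/,
        // ...
        include(resourcePath, issuer) {
          console.log(`  included: ${path.relative(context, resourcePath)} (from ${issuer})`);
          return true; // include all
        }
      }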
      
  3. The two parameters to those functions (resourcePath and issuer) are documented here.
    • Here, they also mention that the resourcePath is the resolved path (not relative path nor name).

PS: While the use of functions for include/exclude is technically (kind of) documented, it is clearly not very obvious, as can be seen from the many many votes on this github issue comment.

Domi