
One thing I don't like about Node is that as soon as you add one require("whatever") you end up with thousands of transitive dependencies, each of which is required on the off chance that the code might be needed.

var whatever = require('whatever');
if (probablyFalse) {
   whatever.theOnlyFunctionThatIUse(); 
   // ...but `whatever` et al require other libraries which I won't actually use
}

I want to build a package to deploy on Google Cloud Functions (and similar apps on Lambda). My code imports @google-cloud/datastore, which has many transitive dependencies, some of which have binary files, computed imports, etc. I don't want to run into package size limitations or increase the time it takes for Node to parse the code. I want to use a packaging tool that does tree shaking and compiles (most of) my code and dependencies into one file, and I want to be able to specify which libraries to exclude from index.js and provide only the necessary files under node_modules.

Because I'm compiling TypeScript and using other libraries in my build/test/package/deploy process, node_modules contains hundreds to thousands of libraries, most of which aren't needed in production.

Ideally, I'd like to be able to build something that looked like:

  • package.json - {"main": "index.js", "dependencies": { "@google-cloud/datastore": "1.4.1" }}
  • index.js - compiled from multiple TypeScript files in my project and most of the code I'm importing from libraries and transitive dependencies
  • node_modules - all of the code that is not included in index.js but is required to run the app, and nothing more.

I've created a simple demo app to show what I'm trying to do (currently I'm using FuseBox):

https://github.com/nalbion/packaged-google-function/blob/master/lib/demo.js

To exclude @google-cloud/datastore and its transitive dependencies from my compiled demo.js, I've added a filterFile:

filterFile: file => {
    return !['@google-cloud/datastore'].includes(file.collection.name);
},
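
For reference, that filterFile option sits in a FuseBox config roughly like the following (simplified sketch assuming FuseBox 3.x; the homeDir, output path and bundle name here are illustrative, not the exact values in my repo):

const { FuseBox } = require('fuse-box');

const fuse = FuseBox.init({
    homeDir: 'src',
    output: 'lib/$name.js',
    target: 'server',
    // keep the datastore client (and whatever it drags in) out of the bundle
    filterFile: file => !['@google-cloud/datastore'].includes(file.collection.name),
});

fuse.bundle('demo').instructions('> demo.ts');
fuse.run();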

I'm confused by the lines in the output:

FuseBox.pkg("@google-cloud/datastore", {}, function(___scope___){
    return ___scope___.entry = "src/index.js";
});

Google Cloud Functions is also confused:

TypeError: Cannot read property 'default' of null
    at helloWorld (/user_code/demo.js:10:42)

For reference, the demo was working until I tried to add the datastore code:

https://github.com/nalbion/packaged-google-function/blob/no-dependencies/lib/demo.js

I suspect that filterFile is not intended for this purpose, or maybe I'm using it wrong.

Is there an equivalent in FuseBox to filter packages?

Is there a better way of doing this?

(Edit) There's a known issue with private git repos:

https://github.com/GoogleCloudPlatform/nodejs-docs-samples/issues/300

See also: Auto deploy Google Cloud Functions from Google Cloud Source Control

Nicholas Albion

1 Answer


You're doing more work than necessary here.

Google Cloud Functions automatically handles dependencies for you by installing them on the server with npm after you deploy (assuming the dependencies are listed in your package.json). It doesn't upload the contents of node_modules. Don't bother trying to create a materialized version of your dependencies, unless you really don't want GCF to install them from npm automatically.
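
For example, a deployment along these lines is enough; Cloud Functions runs npm install against package.json on the server, so node_modules never needs to be uploaded or bundled. This is a minimal sketch: the function name and entity key are made up, and the call assumes the 1.x API of @google-cloud/datastore.

// package.json
// {
//   "main": "index.js",
//   "dependencies": { "@google-cloud/datastore": "1.4.1" }
// }

// index.js
const Datastore = require('@google-cloud/datastore');
const datastore = new Datastore();

exports.helloWorld = (req, res) => {
  // fetch a single entity; the kind and key name are illustrative
  datastore
    .get(datastore.key(['Greeting', 'hello']))
    .then(([entity]) => res.send(entity ? entity.message : 'Hello, world'))
    .catch(err => res.status(500).send(err.message));
};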

Doug Stevenson
  • Does that impact start-up time? When the first request comes in in the morning, does GCF need to download from 10,000 git repos, and then does Node have to parse them all? I'd imagine it would be faster if there were one primary file which had been "tree shaken" and thus much less code, and node_modules already had everything required. – Nicholas Albion Jul 05 '18 at 02:45
  • The docs at that link say "If you are deploying through gcloud...". I was trying to deploy using my own JS script & the REST API: cloudfunctions.projects.locations.functions.create/patch. I don't think GCF installs the deps when I do it that way. – Nicholas Albion Jul 05 '18 at 02:55
  • The install happens after deploy, before the first request is serviced. Practically speaking, I've never heard of anyone saying that period of time is particularly long. I strongly suspect that gcloud is using the same public APIs for deployment that anyone else would use. – Doug Stevenson Jul 05 '18 at 03:09
  • Do they respect `package-lock.json`? They mention private modules, but wouldn't I need to provide an SSH key to download from a private BitBucket repo? `"my-module": "bitbucket:nalbion/my-module#develop",` – Nicholas Albion Jul 05 '18 at 03:11
  • I don't know. It sounds like you're going down the road of asking a completely different question on SO. – Doug Stevenson Jul 05 '18 at 03:13
  • I believe, however, that package-lock.json is an npm concept, so it probably depends on the version of npm in play on the Cloud Functions side. – Doug Stevenson Jul 05 '18 at 03:15
  • There's a known issue with private git repos: https://github.com/GoogleCloudPlatform/nodejs-docs-samples/issues/300 https://stackoverflow.com/questions/47643386/auto-deploy-google-cloud-functions-from-google-cloud-source-control/48568937#48568937 – Nicholas Albion Jul 05 '18 at 12:53