
I am trying to run a workflow on GCP using Nextflow. The problem is that whenever an instance is created to run a process, it has two disks attached: the boot disk (default 10GB) and an additional 'google-pipelines-worker' disk (default 500GB). When I run multiple processes in parallel, multiple VMs are created and each one gets its own additional 500GB disk. Is there any way to customize the 500GB default?

nextflow.config

process {
    executor = 'google-pipelines'
}

cloud {
    driver = 'google'
}

google {
    project = 'my-project'
    zone = 'europe-west2-b'
}

main.nf

#!/usr/bin/env nextflow

barcodes = Channel.from(params.analysis_cfg.barcodes.keySet())

process run_pbb {
    machineType 'n1-standard-2'
    container 'eu.gcr.io/my-project/container-1'

    output:
    file 'this.txt' into barcodes_ch

    script:
    """
    sleep 500
    touch this.txt  # create the declared output file
    """
}

The code provided is just a sample. Basically, this will create a VM instance with an additional 500GB standard persistent disk attached to it.

DUDANF

2 Answers


Nextflow added support for this in the previous release, so I will leave this here.

First run export NXF_VER=19.09.0-edge

Then, in the 'process' scope, you can declare a disk directive like so:

process this_process {
    disk "100GB"
}

This updates the size of the attached persistent disk (default: 500GB).

There is still no way to change the size of the boot disk (default: 10GB).
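
If every process needs a non-default disk size, the same directive can also be set once in the process scope of nextflow.config instead of per process. A minimal sketch, reusing the project and zone values from the question:

// nextflow.config -- sketch only; project/zone taken from the question
process {
    executor = 'google-pipelines'
    disk = '100 GB'     // applied to every process; replaces the 500GB default
}

google {
    project = 'my-project'
    zone = 'europe-west2-b'
}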

DUDANF

I have been checking the Nextflow documentation, where it is specified:

The compute nodes local storage is the default assigned by the Compute Engine service for the chosen machine (instance) type. Currently it is not possible to specify a custom disk size for local storage.

ericcco