I'm retrieving a gzipped CSV file from an FTP server and storing it in Google Cloud Storage. Another GCP service, Dataprep, then needs to read this file, but Dataprep only accepts plain CSV; it can't ungzip the file on the fly.
So what is the proper way to unzip it? Here is my code:
import FTPClient from 'ftp'

// `bucket` is a @google-cloud/storage Bucket instance created elsewhere
const download = (bucket, path) => new Promise((resolve, reject) => {
  const file = bucket.file(path)
  const ftpServer = new FTPClient()
  ftpServer.on('ready', () => {
    ftpServer.get('/file.gz', (err, stream) => {
      if (err) return reject(err)
      stream.once('close', () => {
        ftpServer.end()
        resolve(true)
      })
      stream.pipe(
        file.createWriteStream({
          resumable: false,
          public: false,
          gzip: true // compresses again on upload, even though the source is already gzipped
        })
      )
    })
  })
  ftpServer.connect({
    host: 'somehost.com',
    user: 'user',
    password: '******'
  })
})
I've seen this question, but I'm not sure it's the optimal solution. As far as I understand, that code reads the whole file, loads it into my server's memory, and then writes it back. That seems like a huge waste of memory and traffic. Is there a better way to unzip it?
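One approach I've been considering, sketched below: since both the FTP download and the GCS upload are Node streams, the file can be decompressed in flight with `zlib.createGunzip()`, so only small chunks are ever held in memory. The helper name `gunzipStream` is mine, not from any library:

```javascript
import zlib from 'zlib'
import { pipeline } from 'stream/promises'

// Pipe `source` (gzipped bytes) into `destination` (plain bytes),
// decompressing chunk-by-chunk; memory use stays bounded no matter
// how large the file is, and nothing is buffered to disk.
async function gunzipStream (source, destination) {
  await pipeline(source, zlib.createGunzip(), destination)
}
```

In the `ftpServer.get` callback this would presumably replace the `stream.pipe(...)` call with `await gunzipStream(stream, file.createWriteStream({ resumable: false, public: false }))`, dropping `gzip: true` since the data arriving at GCS is already plain CSV. I'd appreciate confirmation that this is the right pattern.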