I am hitting a OneDrive/SharePoint shared link to an Excel file, trying to download it and save it to S3.
CSV works; .xlsx does not.
I am using Puppeteer, so the code looks something like this:
const downloadURL = "https://netorgft7979143-my.sharepoint.com/:x:/g/personal/gabe_scoop_report/ETCqa1EwxrVNiPYD0aLIq44BJUpEFLYIhcKOFWXuNnYPXQ?download=1";
const buffer = await page.evaluate(({downloadURL}) => {
  return fetch(downloadURL, {
    method: 'GET'
  }).then(r => r.text());
}, {downloadURL});
...
const s3result = await s3
  .upload({
    Bucket: S3BucketPath,
    Key: `${Date.now()}.csv`,
    Body: buffer,
    ContentType: 'text'
  })
  .promise();
Again, this totally works when the endpoint is a CSV. When it's an Excel file, the bits do get written to S3, but not as a valid Excel file.
The URL above is real (sample data, so feel free to hit it), and if you run this code you will see an 11k file written to S3, but Excel will complain that the format is invalid.
I am 99%+ sure it has something to do with binary vs. text, and have spent 2 days poring over SO trying everything from base64 conversion to .blob() or .buffer() and different content types for S3, but nothing did the trick. I am also 99%+ sure Puppeteer has nothing to do with the problem, though wrapping the fetch() within page.evaluate() does make it harder to do things like then(r => r.buffer()), which fails with a complaint that buffer is not a function.
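To make the binary vs. text suspicion concrete, here is a minimal sketch in plain Node (not the Puppeteer code above). The sample bytes are made up for illustration: the first four are the ZIP signature that .xlsx files start with, and the rest are simply bytes that are not valid UTF-8. It shows that decoding bytes as UTF-8 text, which is what r.text() does, and encoding them again changes the data, while a base64 round trip does not.

```javascript
// Hypothetical sample data: ZIP signature "PK\x03\x04" (an .xlsx is a ZIP archive),
// followed by a few bytes that are not valid UTF-8.
const original = Buffer.from([0x50, 0x4b, 0x03, 0x04, 0xff, 0xfe, 0x80, 0x00]);

// Roughly what r.text() does: decode the bytes as UTF-8.
// Each invalid byte is replaced with U+FFFD, so the original value is lost.
const asText = new TextDecoder('utf-8').decode(original);
const viaText = Buffer.from(asText, 'utf8');

// A base64 round trip keeps every byte intact.
const asBase64 = original.toString('base64');
const viaBase64 = Buffer.from(asBase64, 'base64');

console.log(viaText.equals(original));   // false
console.log(viaBase64.equals(original)); // true
console.log(original.length, viaText.length); // 8 14
```

If this is what is happening here, the damage would occur inside page.evaluate(), before S3 is involved. As far as I can tell, .buffer() is a node-fetch addition rather than part of the browser's fetch Response, and page.evaluate() only returns serializable values, so the bytes would presumably have to come back from r.arrayBuffer() encoded as a string (e.g. base64) and be decoded with Buffer.from(..., 'base64') on the Node side.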
Ideas? Thx!