I want to fill in a web form, submit my query, and download the resulting data. Some fields offer either a drop-down menu or a free-text search, and any field can be left blank (if all fields are left blank, the entire database is downloaded). Clicking the "Search and Download" button should start the download of a file.
Here is what I have tried (selecting all records for species "Salmo salar") based on this question. I used my browser (Opera) "Developer Tools" to inspect page elements and identify the names of all the possible fields:
library(httr)

url <- "https://nzffdms.niwa.co.nz/search"

fd <- list(
  search_catchment_no_name = "",
  search_river_lake = "",
  search_sampling_locality = "",
  search_fishing_method = "",
  search_start_year = "",
  search_end_year = "",
  search_species = "Salmo salar",  # species of interest
  search_download_format = 1,      # select csv file format
  submit = "Search and Download"
)

POST(url, body = fd, encode = "form")
I had hoped this would download a csv file (all records for species "Salmo salar"), but no file is downloaded. Instead, the call returns a response object (a list of 10; only the first part is shown):
Response [https://nzffdms.niwa.co.nz/search]
Date: 2019-10-02 23:35
Status: 200
Content-Type: text/html; charset=utf-8
Size: 19.1 kB
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; c...
<meta name="title" content="NZ Freshwater Fish Database...
<meta name="description" content="NIWA NZ Freshwater Fish...
<meta name="keywords" content="NIWA, NZ, Freshwater Fish" />
<meta name="language" content="en" />
<meta name="robots" content="index, follow />
...
Edit
I think the issue is with how I am referring to the "Search and Download" button. When inspecting the web page, most fields look like this:
# end year field
<input maxlength="4" class="form-control" type="text" name="search[end_year]" id="search_end_year">
But the "Search and Download" button element doesn't have a name or id attribute:
<input type="submit" value="Search and Download" class="btn btn-primary btn-md">
Also I have just noticed there is a hidden field, maybe I need to define this?
<input type="hidden" name="search[_csrf_token]" value="d1530f09c1ce8110b5163bd100cb0d67" id="search__csrf_token">
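For reference, here is a minimal, untested sketch of what the two observations above seem to suggest: the form posts fields under their `name` attribute (`search[end_year]`) rather than their `id` (`search_end_year`), and the hidden token probably has to be read from the page and sent back in the same session. Everything beyond the two inputs shown above is an assumption: that the other fields follow the same `search[...]` naming pattern, that the token can be located by its `id`, and that the server then replies with the file. The helper names (`bracket_names`, `get_csrf_token`, `download_search`) and the output file name are hypothetical, and the `xml2` package is an extra dependency.

```r
library(httr)
library(xml2)

# Rename plain keys to the bracketed form names, e.g.
# "end_year" -> "search[end_year]" (pattern inferred from the one input shown above).
bracket_names <- function(fields, prefix = "search") {
  names(fields) <- sprintf("%s[%s]", prefix, names(fields))
  fields
}

# Read the hidden CSRF token from a parsed page, locating it by the id
# seen in the page source.
get_csrf_token <- function(page) {
  node <- xml_find_first(page, "//input[@id='search__csrf_token']")
  xml_attr(node, "value")
}

# GET the form first (httr keeps cookies for the same host between calls),
# then POST the fields plus the token and write the response body to disk.
download_search <- function(url, fields, out_file) {
  form_page <- read_html(content(GET(url), as = "text", encoding = "UTF-8"))
  fields[["_csrf_token"]] <- get_csrf_token(form_page)
  resp <- POST(url, body = bracket_names(fields), encode = "form",
               write_disk(out_file, overwrite = TRUE))
  stop_for_status(resp)
  resp
}

# Hypothetical usage (field keys other than end_year are guesses):
# download_search("https://nzffdms.niwa.co.nz/search",
#                 list(species = "Salmo salar", download_format = 1),
#                 "salmo_salar.csv")
```

The submit button has no `name`, so a browser would not send a value for it; the sketch leaves it out of the body for that reason.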
Any advice on how I can get the file downloading would be much appreciated.
Update - warning
As of 2021-12-1 the database queried in the above question has been significantly updated. The information in this question no longer accurately reflects the website, and the associated answer by chinsoon12 below will no longer return a result if submitted.