I’m trying to use the REST API for Paperless-ngx to upload documents to an HTTP server. Their instructions are as follows:

POSTing documents

The API provides a special endpoint for file uploads:

/api/documents/post_document/

POST a multipart form to this endpoint, where the form field document contains the document that you want to upload to paperless. The filename is sanitized and then used to store the document in a temporary directory, and the consumer will be instructed to consume the document from there.

The endpoint supports the following optional form fields:

title: Specify a title that the consumer should use for the document.

created: Specify the DateTime when the document was created (e.g. “2016-04-19” or “2016-04-19 06:15:00+02:00”).

correspondent: Specify the ID of a correspondent that the consumer should use for the document.

document_type: Similar to correspondent.

tags: Similar to correspondent. Specify this multiple times to have multiple tags added to the document.

The endpoint will immediately return “OK” if the document consumption process was started successfully. No additional status information about the consumption process itself is available, since that happens in a different process.

While I’ve been able to achieve what I needed with curl (see below), I’d like to achieve the same result with Lua.

curl -H "Authorization: Basic Y2hyaXM62tgbsgjunotmeY2hyaXNob3N0aW5n" -F "title=Companies House File 10" -F "correspondent=12" -F "document=@/mnt/nas/10.pdf" http://192.168.102.134:8777/api/documents/post_document/

On the Lua side, I’ve tried various ways to get this to work, but all have been unsuccessful; at best it just times out and returns nil.

Update: I’ve progressed from a nil timeout to a 400 error. The script now prints: 400 table: 0x1593c00 HTTP/1.1 400 Bad Request {"document":["No file was submitted."]}

Please could someone help?

local http = require("socket.http")
local ltn12 = require("ltn12")
local mime = require("mime")
local lfs = require("lfs")

local username = "username"
local password = "password"

local httpendpoint = 'http://192.168.102.134:8777/api/documents/post_document/'
local filepath = "/mnt/nas/10.pdf"
local file = io.open(filepath, "rb")
local contents = file:read( "*a" )

-- https://stackoverflow.com/questions/3508338/what-is-the-boundary-in-multipart-form-data

local boundary = "somerndstring"
local send = "--"..boundary..
            "\r\nContent-Disposition: form-data; "..
            "title='testdoc'; document="..filepath..
            --"\r\nContent-type: image/png"..
            "\r\n\r\n"..contents..
            "\r\n--"..boundary.."--\r\n";

-- Execute request (returns response body, response code, response header)

local resp = {}
local body, code, headers, status = http.request {
    url = httpendpoint,
    method = 'POST',
    headers = {
        -- ['Content-Length'] = lfs.attributes(filepath, 'size') + string.len(send),
        -- ["Content-Length"] = fileContent:len(), 
        -- ["Content-Length"] = string.len(fileContent), 
        ["Content-Length"] = lfs.attributes(filepath, 'size'),
        ['Content-Type'] = "multipart/form-data; boundary="..boundary,
        ["Authorization"] = "Basic " .. (mime.b64(username ..":" .. password)),
        --body = send
    },
    source = ltn12.source.file( io.open(filepath,"rb") ),
    sink = ltn12.sink.table(resp)
}

print(body, code, headers, status)
print(table.concat(resp))

if headers then 
    for k,v in pairs(headers) do 
        print(k,v) 
    end
end 
nodecentral

2 Answers

It seems that your Content-Length header value exceeds the length of the content you are actually sending.

That causes the remote server to wait for more data, which you never provide. As a result, the connection is terminated by a timeout.

Check your code:

local size = lfs.attributes(filepath, 'size') + string.len(send)

The send variable already contains your file contents, so you should not count the file length a second time by adding lfs.attributes.

Just try this:

local size = string.len(send)

You also do not use the send variable anywhere in the actual request, which is another mistake.
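
To make the length point concrete, here is a minimal sketch in plain Lua (it does not need LuaSocket to run). The helper name build_body and the sample values are illustrative assumptions, not part of the question's code. It builds a one-part multipart body in the way the send variable is meant to, and shows that the value for Content-Length is the length of the whole body, which is always larger than the file size alone.

```lua
-- Illustrative helper (hypothetical name): wrap file contents in a single
-- multipart/form-data part whose form field is named "document".
local function build_body(boundary, filename, contents)
    return "--" .. boundary .. "\r\n"
        .. 'Content-Disposition: form-data; name="document"; filename="' .. filename .. '"\r\n'
        .. "Content-Type: application/octet-stream\r\n"
        .. "\r\n"
        .. contents .. "\r\n"
        .. "--" .. boundary .. "--\r\n"
end

-- Stand-in for the bytes that would be read from the file.
local contents = "%PDF-1.4 sample bytes"
local boundary = "somerndstring"
local body = build_body(boundary, "10.pdf", contents)

-- Content-Length counts the whole body, including the boundary lines
-- and the part headers, not just the file bytes.
local headers = {
    ["Content-Type"] = "multipart/form-data; boundary=" .. boundary,
    ["Content-Length"] = #body,
}

print("file bytes:", #contents)
print("body bytes:", headers["Content-Length"])
```

With LuaSocket, that same string would then be the request body, for example source = ltn12.source.string(body) in the http.request table. Streaming the raw file as the source sends bytes with no part headers, which is a likely reason for the server reporting that no file was submitted.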

marsgpl
  • Hi @marsgpl , good spot, I can adjust the size, and thanks also for highlighting the omission of `send`, although I don’t know where to put it. ?? I tried adding it on the end of the http.request, but that didn’t work. I’m really stuck now.. Any ideas? – nodecentral Oct 21 '22 at 19:26
  • @nodecentral check similar question here: https://stackoverflow.com/questions/32103600/uploading-an-image-using-luasocket try setting Content-Length to lfs.attributes(filepath, 'size') – marsgpl Oct 21 '22 at 21:14
  • Thanks @marsgpi, that’s helped, no further nil timeout response, but I do get a 400 error (see below) I assume this points to sending the form and file part, which is the `send` section that I don’t know here that goes..? Any ideas ? FYI Code above updated ? `1 400 table: 0x1593c00 HTTP/1.1 400 Bad Request {"document":["No file was submitted."]} x-content-type-options nosniff date Sat, 22 Oct 2022 08:29:33 GMT cross-origin-opener-policy same-origin referrer-policy same-origin content-language en-us content-length 39 ` etc. – nodecentral Oct 22 '22 at 08:34
  • @nodecentral try debug your requests with a wireshark. it'll be quite easy since you're using plain http. error literally says that you didn't submit a file, so check that you set `source` field correctly: it should look like `source = ltn12.source.file(io.open(filepath,"rb"))` – marsgpl Oct 22 '22 at 16:41
  • Hi @marsgpl, before setting up a machine to use wireshark, please could you help me with where the `send` part goes in the script. You’d highlighted that as a mistake earlier (which I agree) but I don’t know where it should go ? – nodecentral Oct 22 '22 at 20:07
  • @nodecentral you don't need to compose and use the `send` variable since you provide `source` field to the request – marsgpl Oct 23 '22 at 08:55
  • I’m so sorry @marsqpi, I’m just not getting this. It’s frustrating how such a (multipart/form) curl command can be so straightforward, yet doing the same thing with Lua code is so difficult :-). If it helps, if you post the code shared in the original post to httpbin.org, it does not seem to show the file being sent, unlike the same details in the curl command? – nodecentral Oct 28 '22 at 08:42
  • You've spelled my name wrong. And I did not get your question. You are using a third party library that just implements sockets and http as lib's author wishes, it is not a standard way of using HTTP with Lua. I guess Diego Nehab does not maintain it anymore since it was removed from his page. Try reading lib's C code, it will help you to figure your problem out. – marsgpl Oct 29 '22 at 11:13
  • Thanks @marsgpi, I’m not familiar with `lib’s C code` or what that means, but I’ll take a look. Thanks again for your patience with me. While I couldn't get this matter resolved in the way I hoped, I assumed the use of Lua sockets was a pretty standard implementation/use case, but I’m always learning. Thanks again. (We’ll leave this for now as unresolved: there is no standard / recommended way to do multipart/form-data submissions using Lua.) – nodecentral Oct 30 '22 at 11:07

Huge thanks to a person on GitHub who helped me with this, and who also has their own module to do it: https://github.com/catwell/lua-multipart-post

local http = require("socket.http")
local ltn12 = require("ltn12")
http.TIMEOUT = 5

-- Upload a file as multipart/form-data, together with the title and correspondent fields.
local function upload_file(url, filename)
    -- Read the whole file into memory; assert gives a clear error if it cannot be opened.
    local fileHandle = assert(io.open(filename, "rb"))
    local fileContent = fileHandle:read("*a")
    fileHandle:close()

    local boundary = 'abcd'

    -- Part headers: the file part names the form field "document" and carries its own Content-Type.
    local header_b = 'Content-Disposition: form-data; name="document"; filename="' .. filename .. '"\r\nContent-Type: application/pdf'
    local header_c = 'Content-Disposition: form-data; name="title"\r\n\r\nCompanies House File'
    local header_d = 'Content-Disposition: form-data; name="correspondent"\r\n\r\n12'

    -- Each part starts with "--boundary"; the body ends with "--boundary--".
    local MP_b = '--' .. boundary .. '\r\n' .. header_b .. '\r\n\r\n' .. fileContent .. '\r\n'
    local MP_c = '--' .. boundary .. '\r\n' .. header_c .. '\r\n'
    local MP_d = '--' .. boundary .. '\r\n' .. header_d .. '\r\n'

    local MPCombined = MP_b .. MP_c .. MP_d .. '--' .. boundary .. '--\r\n'

    local response_body = {}
    local _, code = http.request {
        url = url,
        method = "POST",
        headers = {
            -- Length of the complete multipart body, not just the file.
            ["Content-Length"] = MPCombined:len(),
            ["Content-Type"] = 'multipart/form-data; boundary=' .. boundary,
        },
        source = ltn12.source.string(MPCombined),
        sink = ltn12.sink.table(response_body),
    }
    return code, table.concat(response_body)
end

-- httpbin.org echoes the request back, which makes it easy to check what was sent.
local rc, content = upload_file('http://httpbin.org/post', '/mnt/nas/10.pdf')
print(rc, content)
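
The hard-coded boundary abcd works for this test, but a boundary must not occur inside any part, and a binary file could happen to contain those four bytes. The API description also allows tags to be specified several times. The sketch below is one way to handle both in plain Lua. The names pick_boundary and encode_multipart, and the field values (including the tag IDs), are hypothetical and chosen for illustration; they are not part of LuaSocket or Paperless-ngx.

```lua
-- Hypothetical helper: return a boundary that does not occur in any of the
-- given strings, by appending a counter until there is no collision.
local function pick_boundary(chunks)
    local base = "----LuaFormBoundary"
    local boundary, n = base, 0
    local function collides()
        for _, chunk in ipairs(chunks) do
            -- Plain find (4th argument true), so hyphens are not treated as pattern characters.
            if chunk:find(boundary, 1, true) then return true end
        end
        return false
    end
    while collides() do
        n = n + 1
        boundary = base .. n
    end
    return boundary
end

-- Hypothetical helper: fields is an ordered list of { name, value } pairs,
-- so a name such as "tags" can appear more than once.
local function encode_multipart(fields, file)
    local chunks = { file.contents }
    for _, field in ipairs(fields) do
        chunks[#chunks + 1] = tostring(field[2])
    end
    local boundary = pick_boundary(chunks)

    local parts = {}
    for _, field in ipairs(fields) do
        parts[#parts + 1] = "--" .. boundary .. "\r\n"
            .. 'Content-Disposition: form-data; name="' .. field[1] .. '"\r\n'
            .. "\r\n"
            .. tostring(field[2]) .. "\r\n"
    end
    parts[#parts + 1] = "--" .. boundary .. "\r\n"
        .. 'Content-Disposition: form-data; name="' .. file.name
        .. '"; filename="' .. file.filename .. '"\r\n'
        .. "Content-Type: " .. file.content_type .. "\r\n"
        .. "\r\n"
        .. file.contents .. "\r\n"
    parts[#parts + 1] = "--" .. boundary .. "--\r\n"

    return table.concat(parts), boundary
end

local body, boundary = encode_multipart(
    {
        { "title", "Companies House File" },
        { "correspondent", 12 },
        { "tags", 3 },  -- made-up tag IDs
        { "tags", 7 },
    },
    {
        name = "document",
        filename = "10.pdf",
        content_type = "application/pdf",
        -- Sample bytes that deliberately contain the base boundary.
        contents = "%PDF-1.4 ----LuaFormBoundary sample",
    }
)

print(boundary)
print(#body)
```

The returned body and boundary would slot into the http.request call in upload_file: body as the argument to ltn12.source.string, #body as Content-Length, and boundary in the Content-Type header.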
nodecentral