I want to query a Couchbase bucket from R and store the results in a data frame.
I went through this blog post and tried to replicate the steps on my own cluster using a custom query, but got the following error message in the Couchbase logs:
Invalid post received: {mochiweb_request,
[#Port<0.5548256>,'POST',"/query/service/",
{1,1},
{6,
{"host",
{'Host',
"[removed]:8091"},
{"accept-encoding",
{'Accept-Encoding',"gzip, deflate"},
{"accept",
{'Accept',
"application/json, text/xml, application/xml, */*"},
nil,nil},
{"content-type",
{'Content-Type',
"application/x-www-form-urlencoded;charset=UTF-8"},
{"content-length",
{'Content-Length',"59"},
nil,nil},
nil}},
{"user-agent",
{'User-Agent',
"libcurl/7.54.0 r-curl/2.6 httr/1.2.1"},
nil,nil}}}]}
Then I tried to use the reticulate package in R to query Couchbase through the Python SDK.
Python Code:
from couchbase.n1ql import N1QLQuery
from couchbase.bucket import Bucket
import pandas as pd
host = '[host_name]:8091'
bucket = 'my-bucket'
cb = Bucket('couchbase://' + host + '/' + bucket)
query = N1QLQuery('Select * from `my-bucket`')
df = pd.DataFrame()
for row in cb.n1ql_query(query):
    df = df.append(row, ignore_index=True)
The code above works fine and fills the pandas data frame df with the expected values.
Below is my unsuccessful attempt to translate the above Python code to R using the reticulate package.
R Code:
library(reticulate)
reticulate::use_condaenv("my-env", "/usr/local/anaconda3/bin/conda")
Bucket <- reticulate::import("couchbase.bucket")$Bucket
N1QLQuery <- reticulate::import("couchbase.n1ql")$N1QLQuery
pd <- reticulate::import("pandas", "pd")
host <- '[host_name]:8091'
bucket <- 'my-bucket'
cb <- Bucket(paste0('couchbase://', host, '/', bucket))
query <- N1QLQuery('Select * from `my-bucket`')
Up to this point everything works fine.
Now, how can I translate the Python for loop to R so that it appends the query results to a data frame?
for row in cb.n1ql_query(query):
    df = df.append(row, ignore_index=True)
I tried to use reticulate::iterate(), but it throws an error, most likely because I'm not using the function correctly.
> reticulate::iterate(cb$n1ql_query(query), print)
Error in reticulate::iterate(cb$n1ql_query(query), print) :
iterate function called with non-iterator argument
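One possible reading of this error, sketched below in plain Python: the object returned by cb.n1ql_query(query) may be an iterable (it defines __iter__, so a for loop works) without being an iterator (no __next__). This is an assumption about the SDK, not something confirmed here. FakeQueryResult is a hypothetical stand-in for the real query result, so the sketch runs without a cluster.

```python
class FakeQueryResult:
    """Hypothetical stand-in for the object returned by cb.n1ql_query(query)."""

    def __init__(self, rows):
        self._rows = rows

    def __iter__(self):
        # Defining __iter__ makes the object iterable (usable in a for loop),
        # but without __next__ the object is not itself an iterator.
        for row in self._rows:
            yield row


result = FakeQueryResult([{"id": 1}, {"id": 2}])

try:
    next(result)  # raises TypeError: the object has no __next__
    is_iterator = True
except TypeError:
    is_iterator = False

rows_via_iter = [row for row in iter(result)]  # iter() returns a real iterator
rows_via_list = list(result)                   # list() also drains the iterable

print(is_iterator, rows_via_iter, rows_via_list)
```

If that assumption holds, wrapping the result in Python's built-in iter() or list() before handing it to R (for example through reticulate::import_builtins()) might get past the error, though this is untested against the Couchbase SDK.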
The last resort would be to use the rPython package to call the Python script directly, but even that doesn't look straightforward.
Any working solution would do. I don't mind how we get the R data frame.
Help is much appreciated :)