I'm trying to take a local dataframe and write it to Drill as a Parquet table using the PyODBC library. I understand that PyDrill has features better suited to this, but I am able to create and read data from the Drill instance; I only struggle…
New programmer with SQL and Apache Drill here. I'm trying to take this SQL command from DB1:
SELECT screen_name, job_id, COUNT(*) AS counter
FROM twitter.mention t
WHERE t.job_id = 290
OR t.job_id = 261
OR t.job_id = 303
GROUP BY screen_name,…
I have a Dask data frame that has two columns, a date and a value.
I store it like so:
ddf.to_parquet('/some/folder', engine='pyarrow', overwrite=True)
I expect Dask to store the date column as a date type in Parquet, but when I query it with Apache…
I'm running a 5-node MapR Drill cluster, and everything works fine, except that sometimes (it can be multiple times a day or once in a few days, with no specific pattern), when I try to connect to one of the drillbits (via Drill Web-UI…
I am trying to create a MySQL storage plugin for Apache Drill using pydrill. It throws this error:
RequestError: TransportError(400, 'Unrecognized field "type" (class org.apache.drill.exec.server.rest.PluginConfigWrapper), not marked as ignorable (2…
I am using PyDriller to extract metrics from certain GitHub repos.
While implementing a branch-specific extraction function, shown in the following code,
the terminal returns the following error: Problem reading repository at repoX
from…
I am trying to start Apache Drill, but I keep getting this error. How can I resolve it?
Error: Failure in starting embedded Drillbit: java.io.IOException: Failed to bind to 0.0.0.0/0.0.0.0:8047 (state=,code=0)
I tried to edit…
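A bind failure like the one above often means another process already holds the port (8047 is the port named in the error message). As a minimal sketch, assuming Python is available on the same machine, the check below tries to bind the port itself; `can_bind` is a hypothetical helper name, not part of Drill or pydrill.

```python
import socket


def can_bind(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can bind to host:port right now.

    Hypothetical helper for illustration; it is not part of Drill.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permissions problem.
            return False
        return True


if __name__ == "__main__":
    port = 8047  # port taken from the error message above
    if can_bind(port):
        print(f"Port {port} is free; the bind failure has another cause.")
    else:
        print(f"Port {port} cannot be bound; another process may be using it.")
```

If the port turns out to be taken, a stale Drillbit from an earlier session is one possible culprit worth ruling out.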
I am able to load a CSV into a pandas dataframe, but it is stuck in a list. How can I load directly into a pandas dataframe from PyDrill, or unlist the pandas dataframe columns and data? I've tried unlisting and it puts everything into a list of a…
I wrote two scripts with the same goal.
The first submits a query to Apache Drill through pydrill. The query in this case is a large number of SELECT commands joined with UNION ALL, and I save the result in a dataframe.
The second submits…
I have a parquet dataset where I saved a byte_array.
I am using Apache Drill to query the dataset:
SELECT id, x, y FROM `dfs.root`.`./data`
This gives me:
+--------------------------------------+-------------+-------------+
| ID …