Questions tagged [pydrill]

Python driver for Apache Drill

https://github.com/PythonicNinja/pydrill

10 questions
2
votes
0 answers

Using PyODBC to create Apache Drill Tables

I'm trying to take a local dataframe and create it as a Parquet table in Drill using the PyODBC library. I understand that PyDrill has features better suited for this, but I am able to create and read data from the Drill instance; I only struggle…
rookieJoe
  • 509
  • 6
  • 14
2
votes
1 answer

SQL and Apache Drill

New programmer with SQL and Apache Drill here. I'm trying to take this SQL command from DB1: SELECT screen_name, job_id, count(*) as counter from twitter.mention t WHERE t.job_id = 290 or t.job_id = 261 or t.job_id = 303 group by screen_name,…
Ryan
  • 71
  • 2
  • 7
1
vote
3 answers

Storing with Dask date/timestamp columns in Parquet

I have a Dask data frame that has two columns, a date and a value. I store it like so: ddf.to_parquet('/some/folder', engine='pyarrow', overwrite=True). I'm expecting Dask to store the date column as a date in Parquet, but when I query it with Apache…
ps0604
  • 1,227
  • 23
  • 133
  • 330
1
vote
0 answers

Authentication to Apache Drill is temporarily failing

I'm running a 5-node MapR Drill cluster, and everything is working fine, except that sometimes (it can be multiple times during the day, or once in a few days, with no specific pattern), when I try to connect to one of the drillbits (via Drill Web-UI…
kfy
  • 11
  • 1
1
vote
1 answer

Use pydrill storage_update() to create Apache Drill storage

I am trying to create a MySQL storage plugin for Apache Drill using pydrill. It throws an error: RequestError: TransportError(400, 'Unrecognized field "type" (class org.apache.drill.exec.server.rest.PluginConfigWrapper), not marked as ignorable (2…
0
votes
0 answers

Repositories do not respond when using pydriller to extract data from a specific branch

I am using pydriller to extract metrics from certain GitHub repos. While implementing a branch-specific extraction function, as you can see in the following code, the terminal returns the following errors: Problem reading repository at repoX from…
0
votes
1 answer

Error: Failure in starting embedded Drillbit: java.io.IOException (Ubuntu)

I am trying to start Apache Drill, but I keep running into this error. How can I figure it out? Error: Failure in starting embedded Drillbit: java.io.IOException: Failed to bind to 0.0.0.0/0.0.0.0:8047 (state=,code=0) I tried to edit…
Amin Khodamoradi
  • 392
  • 1
  • 6
  • 18
0
votes
2 answers

Load CSV into pandas DataFrame from PyDrill query

I am able to load a CSV into a pandas DataFrame, but it is stuck in a list. How can I load directly into a pandas DataFrame from PyDrill, or unlist the DataFrame's columns and data? I've tried unlisting and it puts everything into a list of a…
0
votes
0 answers

How to force header order in the result of a SELECT?

I developed two scripts with the same goal. The first submits a query to Apache Drill via pydrill. The query in this case is a series of SELECT commands joined by UNION ALL, and I save the result in a dataframe. The second script submits…
ltito
  • 13
  • 3
0
votes
1 answer

Select binary data from Parquet using Drill

I have a parquet dataset where I saved a byte_array. I am using Apache Drill to query the dataset: SELECT id, x, y FROM `dfs.root`.`./data` This gives me: +--------------------------------------+-------------+-------------+ | ID …
user1302023
  • 31
  • 1
  • 1
  • 9