Questions tagged [duckdb]

Issues related to the usage of DuckDB (www.duckdb.org)

180 questions
7
votes
2 answers

How do I limit the memory usage of duckdb in R?

I have several large R data.frames that I would like to put into a local duckdb database. The problem I am having is duckdb seems to load everything into memory even though I am specifying a file as the location. Also, it isn't clear to me the…
Kevin
  • 107
  • 5
7
votes
1 answer

how to vacuum (reduce file size) on duckdb

I am testing the duckdb database for analytics, and I must say it is very fast. The issue is that the database file keeps growing, but I need to make it small to share it. In sqlite I recall using the VACUUM command, but here the same command is doing…
Forge
  • 1,587
  • 1
  • 15
  • 36
6
votes
3 answers

Using DuckDB with s3?

I'm trying to use DuckDB in a jupyter notebook to access and query some parquet files held in s3, but can't seem to get it to work. Judging from past experience, I feel like I need to assign the appropriate file system but I'm not sure how/where to do…
Ethan
  • 61
  • 1
  • 2
5
votes
1 answer

Deterministic random number generation in duckdb with dplyr syntax

How can I use duckdb's setseed() function (see reference doc) with dplyr syntax to make sure the analysis below is reproducible? # dplyr version 1.1.1 # arrow version 11.0.0.3 # duckdb 0.7.1.1 out_dir <- tempfile() arrow::write_dataset(mtcars,…
Ashirwad
  • 1,890
  • 1
  • 12
  • 14
5
votes
3 answers

Reading partitioned parquet files in DuckDB

Background: DuckDB allows direct querying of parquet files, e.g. con.execute("Select * from 'Hierarchy.parquet'"). Parquet allows files to be partitioned by column values. When a parquet file is partitioned, a top-level FOLDER is created with the…
tomanizer
  • 851
  • 6
  • 16
4
votes
1 answer

Create an auto incrementing primary key in DuckDB

Many database engines support auto-incrementing primary keys, and I would like to use this approach in my new DuckDB approach, but I can't figure out how to set it up. For example, in MySQL: CREATE TABLE Persons ( Personid int NOT NULL…
Mark Payne
  • 557
  • 5
  • 12
3
votes
1 answer

duckdb query takes too long to process and return inside Flask application

I have a Flask app and want to use duckdb as a database for several endpoints. My idea is to query the data and return it as a .parquet file. When I test my database with a simple Python script outside of the Flask app, it can query the data and…
codeweird
  • 145
  • 3
  • 11
3
votes
1 answer

DuckDB multithreading is not working on Google Cloud Run with multiple CPUs

I have a relatively simple Cloud Function (Gen2), which is deployed using Cloud Run. Regardless of how many vCPUs I assign, DuckDB seems to use only 1 CPU. The memory works fine; I checked that using the Metrics Dashboard. Any idea what's wrong…
Mim
  • 999
  • 10
  • 32
3
votes
1 answer

Fix unimplemented Casting error in Duckdb Insert

I am using DuckDB to insert data via batch insert. While using the following code: conn.execute('INSERT INTO Main SELECT * FROM df') I am getting the following error: Invalid Input Error: Failed to cast value: Unimplemented type for cast (VARCHAR -> NULL) I…
3
votes
0 answers

Reading parquet format from javascript application bundled with webpack 5

I am developing a web application in JavaScript with the webpack 5 bundler. I am looking for a solution so that my application can use parquet data (downloaded on the fly and decoded). So far, I have tried the following JavaScript libraries: ParquetJS -…
julien
  • 898
  • 3
  • 17
  • 32
3
votes
1 answer

How can I initialize `duckdb-wasm` within NextJS?

I'm working on a NextJS project that leverages a wasm package via npm; specifically, this is duckdb-wasm. duckdb-wasm needs to initialize from a set of bundles (e.g. based on browser capability). This can be done with JSDelivr or by specifying the…
maxcountryman
  • 1,562
  • 1
  • 24
  • 51
3
votes
1 answer

How can I write raw binary data to duckdb from R?

My best guess is that this simply isn't currently supported by the {duckdb} package; however, I'm not sure if I'm doing something wrong or not in the intended way. Here's a reprex which reproduces the (fairly self-explanatory) issue: con <-…
wurli
  • 2,314
  • 10
  • 17
3
votes
1 answer

DuckDB deleting rows from dataframe error: RuntimeError: Binder Error: Can only delete from base table

I have just started using DuckDB in a Python Jupyter notebook. So far everything has worked great. I can't figure out how to delete records from a dataframe. When I try: test_df = pd.DataFrame.from_dict({"i":[1, 2, 3, 4], "j":["one", "two",…
3
votes
1 answer

DuckDB: turn dataframe dictionary column into MAP column

I have a Pandas dataframe with a column containing dictionary values. I'd like to query this dataframe using DuckDB and convert the result to another dataframe, and have the type preserved across the query. DuckDB has the MAP data type which looks…
Aron
  • 1,552
  • 1
  • 13
  • 34
3
votes
0 answers

tableau how to connect duckdb

I downloaded the DuckDB JDBC driver and copied it to the install directory: C:\Program Files\Tableau\Drivers\duckdb_jdbc-0.2.9.jar. Then I start Tableau, choose the other JDBC drivers option to connect, set the configuration like this, and try to connect…
oneswarm
  • 31
  • 4