Questions tagged [incremental-load]
33 questions
5 votes, 1 answer
Is there something like the Glue "Bookmark" feature in Spark which keeps track at the job level?
I am looking to see if there is something like the AWS Glue "bookmark" in Spark. I know there is checkpointing in Spark, which works well on an individual data source. In Glue we could use a bookmark to keep track of all the files across different tables…

VE88
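Outside Glue, a bookmark-like mechanism is often hand-rolled: persist the set of already-processed file paths between runs and feed only the unseen ones to the Spark reader. A minimal stdlib sketch of that state handling, assuming a single job-level state file (the path `bookmark.json` is a hypothetical choice):

```python
import json
import os

STATE_FILE = "bookmark.json"  # hypothetical job-level state file

def load_bookmark(state_file=STATE_FILE):
    """Return the set of file paths processed by previous runs."""
    if os.path.exists(state_file):
        with open(state_file) as f:
            return set(json.load(f))
    return set()

def save_bookmark(processed, state_file=STATE_FILE):
    """Persist the full set of processed paths for the next run."""
    with open(state_file, "w") as f:
        json.dump(sorted(processed), f)

def new_files(input_dir, processed):
    """List files in input_dir that no previous run has seen."""
    current = {os.path.join(input_dir, name) for name in os.listdir(input_dir)}
    return sorted(current - processed)
```

Each run would call `new_files(...)`, pass the result to `spark.read`, and then `save_bookmark(...)` the union; Structured Streaming checkpoints with a `Trigger.AvailableNow`-style batch run are another route to the same effect.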
4 votes, 1 answer
While doing an incremental load using dbt, I want to aggregate if the row exists, else insert
I am using dbt to incrementally load data from one schema in Redshift to another to create reports. In dbt there is a straightforward way to incrementally load data with an upsert. But instead of doing the traditional upsert, I want to take the sum (on the…

isrj5
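The row-level logic being asked for, "sum into the existing row if the key matches, else insert", can be stated in a few lines of plain Python (in dbt itself this would be an incremental model with a custom merge strategy; the dict-of-measures shape here is a simplifying assumption):

```python
def merge_with_sum(target, increment):
    """Upsert increment into target, summing the measure on key collisions.

    target and increment each map a business key to a numeric measure.
    """
    merged = dict(target)
    for key, value in increment.items():
        # existing key: accumulate; new key: plain insert
        merged[key] = merged.get(key, 0) + value
    return merged
```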
4 votes, 1 answer
Delta Live Tables for Batch Incremental Processing
Is it possible to use Delta Live Tables to perform incremental batch processing?
Now, I believe that this code will always load all of the data available in the directory when the pipeline is run:
CREATE LIVE TABLE lendingclub_raw
COMMENT "The raw…

Minura Punchihewa
2 votes, 0 answers
Duplicates in Snowflake Stream
With the setting SHOW_INITIAL_ROWS = TRUE, we created a stream on top of a view (which has many joins).
We created a stored procedure with a single MERGE statement that ingests all of the data from the stream into a target table. The following is the…

shiva nagesh
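When a stream sits on a many-join view, duplicates per key are usually collapsed to a single row before the MERGE (in Snowflake, typically with `QUALIFY ROW_NUMBER() ... = 1`). The dedupe rule itself, keep the latest row per key by an update timestamp, sketched in plain Python with hypothetical column names `id` and `updated_at`:

```python
def dedupe_latest(rows, key="id", ts="updated_at"):
    """Keep one row per key: the one with the greatest timestamp."""
    latest = {}
    for row in rows:
        k = row[key]
        # replace the kept row only if this one is strictly newer
        if k not in latest or row[ts] > latest[k][ts]:
            latest[k] = row
    return list(latest.values())
```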
2 votes, 0 answers
SAP incremental data load in Azure Data Factory
I'm trying to implement an Extractor pipeline in ADF, with several Copy Data activities (SAP ERP Table sources). To save some processing time, I'd like to have some deltas (incremental load). What's the best way to implement this?
What I'm trying at…

DavideVaz
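The delta described here is usually implemented as a high-watermark pattern: persist the maximum change timestamp of the previous run (in ADF, often in a watermark table read by a Lookup activity) and filter the next extract on it. A minimal sketch of that logic, assuming a change-timestamp field named `changed_on` (hypothetical):

```python
def extract_delta(rows, watermark, ts_field="changed_on"):
    """Return rows changed after the watermark, plus the new watermark."""
    delta = [r for r in rows if r[ts_field] > watermark]
    # advance the watermark only if something newer actually arrived
    new_watermark = max((r[ts_field] for r in delta), default=watermark)
    return delta, new_watermark
```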
2 votes, 1 answer
ADF to Snowflake incremental load and streams
I am trying to load files from my Azure blob into a Snowflake table incrementally, after which, in Snowflake, I put streams on that table and load the data into the target table.
I am unable to do an incremental load from Azure to Snowflake. I have tried many…

Coder1990
1 vote, 1 answer
Is there a way to make the dbt_cloud_pr_xxxx_xxx a clone of existing data?
So, using dbt Cloud and having a run on every pull request, my incremental models are fully refreshed, since everything runs in a new DB destination (dbt_cloud_pr_xxxxx_xxx). Any way of solving this? Perhaps creating the new destination as a clone…

Ezer K
1 vote, 1 answer
Displaying images in gridview using incremental loading
I have a GridView that displays 435 images from a local package. I tried using Incremental Loading.
XAML:
…

Rose
1 vote, 4 answers
Power BI Athena Incremental Refresh
I have been successfully using Power BI’s incremental refresh daily with a MySQL data source. However, I can't get this configured with AWS Athena, because seemingly the latter interprets the values in the required parameters RangeStart and RangeEnd…

Ricky McMaster
0 votes, 0 answers
How to implement incremental load in Pentaho (Spoon)
I want to implement an incremental load in Pentaho. I have two tables in my OLTP, and I want to left join them and land them as a single table in OLAP. The OLTP and OLAP are on different database connections in MySQL, meaning there are two different database…

ahmed
0 votes, 0 answers
How to perform incremental load in Snowflake
I have a table T1 in Snowflake that gets truncated and loaded with data weekly. I have to create another table T2, to which I should pass the initial full load from T1. Then, after each weekly load into T1, the T2 table also gets inserted or updated…

Devaraj Mani Maran
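The weekly T1-to-T2 step described is an upsert, which Snowflake expresses as a MERGE (`WHEN MATCHED THEN UPDATE ... WHEN NOT MATCHED THEN INSERT`). The row-level semantics of that statement, sketched in plain Python with a hypothetical key column `id`:

```python
def upsert(target, source, key="id"):
    """Insert source rows absent from target; update rows whose key matches."""
    by_key = {row[key]: dict(row) for row in target}
    for row in source:
        # existing key -> update in place, new key -> insert
        by_key[row[key]] = dict(row)
    return list(by_key.values())
```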
0 votes, 1 answer
How to load data from the GitHub GraphQL API using since, like the REST API
I have written a pipeline to load issues from GitHub to BigQuery. I want to make it incremental, for example, load only the data from the last run to the present run. I tweaked the pipeline code to pass a since arg, but I don't know if the GraphQL…

Aman Gupta
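Independently of whether the GraphQL query accepts a since-style filter, the incremental half of such a pipeline is just a stored cursor: persist the timestamp of the last successful run and keep only issues updated after it. A stdlib sketch, assuming ISO-8601 `updatedAt` strings (which compare correctly as plain strings) and a hypothetical cursor-file layout:

```python
import json
import os

def read_cursor(path):
    """Return the ISO-8601 timestamp of the last successful run, if any."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)["since"]
    return "1970-01-01T00:00:00Z"  # first run: take everything

def incremental_issues(issues, since):
    """Keep only issues updated after the stored cursor."""
    return [i for i in issues if i["updatedAt"] > since]

def write_cursor(path, since):
    """Persist the cursor for the next run."""
    with open(path, "w") as f:
        json.dump({"since": since}, f)
```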
0 votes, 1 answer
Add an additional header from a previous activity to a REST call in a Copy Data activity
I have a pipeline that should sync data from a REST API source to a SQL table. There are two steps in this pipeline:
Get the last changed date field from the dataset in the previous run, so that I know that I have to sync all records that got…

AntonyJ
0 votes, 0 answers
Calculating the count of records and then appending those counts daily to a separate dataset using PySpark
I have a dynamic dataset like the one below, which is updated every day. On Jan 11 the data is:

Name   | Id
John   | 35
Marrie | 27

On Jan 12 the data is:

Name   | Id
John   | 35
Marrie | 27
MARTIN | 42

I need to take a count of the records and then…
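The counts dataset described is append-only: each run computes the source's row count and appends one (date, count) record to the history (in PySpark this would be a `count()` followed by an append-mode write). The core step, sketched in plain Python with hypothetical column names:

```python
def append_daily_count(history, run_date, rows):
    """Append today's record count to the running history (append-only)."""
    return history + [{"date": run_date, "count": len(rows)}]
```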
0 votes, 0 answers
Incremental load to S3 using Python
I am looking for the steps, and some code, to write an incremental load/ingestion on top of a historical load in S3 using Python.
Can anyone please help me? I need a little help with the incremental load.

Ravindu Mysore
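One common approach on top of a historical load is a manifest diff: record (size, mtime) per file from the previous load and upload only files that are new or changed. The boto3 upload calls are omitted here; this stdlib sketch shows just the diff logic over a local directory (all names are illustrative):

```python
import os

def build_manifest(root):
    """Map relative path -> (size, mtime) for every file under root."""
    manifest = {}
    for dirpath, _, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            st = os.stat(full)
            manifest[rel] = (st.st_size, int(st.st_mtime))
    return manifest

def files_to_upload(local_manifest, remote_manifest):
    """Relative paths that are new or whose size/mtime changed."""
    return sorted(
        rel for rel, meta in local_manifest.items()
        if remote_manifest.get(rel) != meta
    )
```

After a successful run, the local manifest would be written back (e.g., as a JSON object in the bucket) to serve as the remote manifest for the next run.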