Questions tagged [dagster]

Dagster is an open source system for building modern data applications.

Dagster, by Elementl, is a set of abstractions for building self-describing, testable, and reliable data applications. It uses functional data programming, gradual/optional typing, and testability to facilitate composition of data applications from DAGs of solids, its basic computational unit.

142 questions
6
votes
1 answer

How would you parameterize Dagster pipelines to run same solids with multiple different configurations/assets?

Let's say I create a Dagster pipeline with the following solids: (1) execute a SQL query from a file and get the results; (2) write the results to a table. I want to do this for, say, 10 different tables in parallel, each table requiring a different SQL query. What would…
Binh Pham
  • 123
  • 1
  • 8
5
votes
3 answers

Start Dagster Schedule automatically

Hi, I am learning Dagster and I want help with starting a schedule. I am able to add and start a schedule in Dagit, but I want the schedule to start automatically instead of turning on every schedule from Dagit. Here is my code: @solid() def test(context): …
ZeroThree
  • 51
  • 3
5
votes
2 answers

Dagster loop over solid's output and concurrent processing

I have a Dagster pipeline consisting of two solids (reproducible example below). The first (return_some_list) outputs a list of some objects. The second solid (print_num) accepts an element from the first list (not the full list) and does some…
cyau
  • 449
  • 4
  • 14
4
votes
2 answers

Why does Kubernetes delete secrets after helm upgrade?

When performing a helm upgrade, I find that secrets that were created upon initial install are deleted. Why is this? The example I am using is Dagster. When installing with: helm install dagster dagster/dagster \ …
pomply
  • 93
  • 5
4
votes
1 answer

How to use dictionary yielded from other solid in a composite Solid?

For example, I have a solid named initiate_load that yields a dictionary and an integer, something like: @solid( output_defs=[ OutputDefinition(name='l_dict', is_required=False), OutputDefinition(name='l_int',…
Atif
  • 1,012
  • 1
  • 9
  • 23
4
votes
1 answer

How to avoid running the rest of a dagster pipeline under certain conditions

Say I have two solids in Dagster connected in a pipeline. The first solid may do some processing and generate a valid input so that the rest of the pipeline executes, or generate an invalid input that should not be further processed. To achieve this…
ElBrocas
  • 399
  • 4
  • 13
3
votes
1 answer

Dagster: can you trigger a job to run via an API?

I have been looking all over for the answer, but can't seem to find what I'm looking for. I want to create an API endpoint that can pass information to the Dagster assets and trigger a run. For example, I have the following asset in…
3
votes
1 answer

Is it possible to generate jobs in Dagster dynamically using configuration from a database?

Currently, my database has multiple departments. I need to apply a data pipeline to all of these departments with different configurations. I want to load the configuration for each department from a database, then use these configurations to generate a…
highDopamine
  • 97
  • 10
3
votes
1 answer

Is it possible to create dynamic jobs with Dagster?

Consider this example - you need to load table1 from source database, do some generic transformations (like convert time zones for timestamped columns) and write resulting data into Snowflake. This is an easy one and can be implemented using 3…
mishkin
  • 5,932
  • 8
  • 45
  • 64
3
votes
1 answer

How to create partitions with a schedule in Dagster?

I am trying to create partitions within Dagster that will allow me to do backfills. The documentation has an example, but it uses the days of the week (which I was able to replicate). However, I am trying to create partitions with…
sgruskin
  • 51
  • 3
3
votes
2 answers

Caching Dagster's pipeline results

Is there a way to cache the output of the solids in the pipeline in such a way that if I run the same pipeline but with a slightly different configuration (think hyper-parameter tuning), certain initial steps in the pipelines that are unaffected by…
moomima
  • 1,200
  • 9
  • 12
3
votes
2 answers

Integrating Dagster with Django

Hi, I am trying to integrate Dagster into an ongoing Django project. I am struggling with providing Django context (models, apps, ...) to Dagster. As of now I am just checking whether dagit is present in sys.argv[0] in the __init__.py of apps that are…
2
votes
0 answers

How not to overwrite materialized assets in Dagster

I'm new to Dagster. According to the docs, when you use the built-in filesystem IO manager: Subsequent materializations of an asset will overwrite previous materializations of that…
2
votes
1 answer

How to iterate over a list of values returned from ops to jobs in Dagster

I am new to the Dagster world and working with the ops and jobs concepts. My requirement is to read a list of data from config_schema, pass it to an @op function, and return the same list to jobs. The code is shown as…
Piyush Jiwane
  • 179
  • 3
  • 13
2
votes
1 answer

Dagster sensor to check for new records in a table

I have two tables, where the second depends on the first. Whenever new records are added to the first, I want to run a Dagster job. I came across sensors, but I am not sure if my requirement can be fulfilled using the functionality they provide. Any ideas?
Abi
  • 83
  • 6