Questions tagged [kiba-etl]
36 questions
7
votes
1 answer
How to do an aggregation transformation in a Kiba ETL script (kiba gem)?
I want to write a Kiba ETL script that reads from a source CSV and writes to a destination CSV, applying a list of transformation rules. The second transformer is an aggregation, performing an operation such as select name, sum(euro) group by name
Kiba ETL Script…

Umar Siddiqui
- 73
- 4
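A minimal sketch of the aggregation described in this question (select name, sum(euro) group by name), written in plain Ruby so it runs without the gem. The class name `SumEuroByName` and the `:name` / `:euro` row keys are hypothetical. The `process` plus yielding `close` shape is an assumption about how a recent Kiba version calls class transforms; check it against the version you run before relying on it.

```ruby
# Hypothetical transform: sums the :euro column per :name.
# It mimics the shape of a Kiba class transform (process + close), but is
# driven by hand below so the sketch does not depend on the gem.
class SumEuroByName
  def initialize
    @totals = Hash.new(0)
  end

  # Called once per row. Returning nil means no row is passed downstream
  # at this point; the totals are only accumulated.
  def process(row)
    @totals[row[:name]] += Float(row[:euro])
    nil
  end

  # Called once after the last row. Yields one summary row per name.
  # (Assumption: the runner invokes close with a block on transforms.)
  def close
    @totals.each { |name, total| yield({ name: name, euro: total }) }
  end
end

# Sample rows as a CSV source might produce them (values arrive as strings).
rows = [
  { name: "alice", euro: "10.5" },
  { name: "bob",   euro: "3" },
  { name: "alice", euro: "4.5" }
]

transform = SumEuroByName.new
rows.each { |row| transform.process(row) }

summary = []
transform.close { |row| summary << row }
```

The grouping state lives in the transform instance, so memory use grows with the number of distinct names rather than with the number of rows.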
4
votes
2 answers
Should I use Rails for consistency? (for ETL project)
CONTEXT
I'm new to Ruby and all that jazz, but I'm not new to dev.
I'm taking over a project based on 2 rails/puma repositories for web & APIs.
I'm building a new repository for a backend data processing app, using Kiba, that will run through…

Tristan M
- 133
- 1
- 3
4
votes
1 answer
Modify a range of rows after applying transformations
I want to write a Kiba transformation that lets me insert the same information for a specific number of rows. In this case I have an xls file that contains subheaders, and these subheaders…

José Añasco
- 43
- 3
3
votes
2 answers
Transforming a table into a hash of sets using Kiba-ETL
I'm busy working through an ETL pipeline, but for this particular problem, I need to take a table of data, and turn each column into a set - that is, a unique array.
I'm struggling to wrap my head around how I would accomplish this within the Kiba…

Gabriel Fortuna
- 122
- 7
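One possible way to collect each column into a set, sketched in plain Ruby. The class name `ColumnSets` and the sample column names are hypothetical. The `write` / `close` methods follow the general shape of a Kiba destination, which is an assumption about how you would plug this into a pipeline, not a confirmed answer to the question.

```ruby
require "set"

# Hypothetical destination-style collector: builds one Set per column.
class ColumnSets
  attr_reader :sets

  def initialize
    # Each unseen column key lazily gets its own empty Set.
    @sets = Hash.new { |hash, column| hash[column] = Set.new }
  end

  # Called once per row; adds every value to the set for its column.
  def write(row)
    row.each { |column, value| @sets[column] << value }
  end

  # Nothing to flush in this sketch.
  def close; end
end

# Sample table rows (made-up data for illustration only).
rows = [
  { city: "Paris", country: "FR" },
  { city: "Lyon",  country: "FR" },
  { city: "Paris", country: "FR" }
]

collector = ColumnSets.new
rows.each { |row| collector.write(row) }
collector.close

column_sets = collector.sets
```

Duplicates are dropped by `Set` itself, so no explicit uniqueness check is needed; call `to_a` on a set if a unique array is required.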
3
votes
2 answers
How to pass parameters into your ETL job?
I am building an ETL job that will run against different sources, selected by a variable.
How can I execute my job (rake task)
Kiba.run(Kiba.parse(IO.read(etl_file),etl_file))
and pass in parameters for my etl_file to then use for its sources?
source…

ElderFain
- 93
- 5
2
votes
1 answer
Is there an obvious way to reduce rows when using Kiba?
Firstly - Thibaut, thank you for Kiba. It goes toe-to-toe with 'enterprise' grade ETL tools and has never let me down.
I'm busy building an ETL pipeline that takes a number of rows, and reduces them down into a single summary row. I get the feeling…

Gabriel Fortuna
- 122
- 7
2
votes
1 answer
Can I run a Kiba job inside a Rails service?
I am running a Kiba job from a Rails service that is called inside a controller.
Here is the current code.
class KibaRunner
  attr_reader :job, :logger

  def initialize(job)
    @job = job
    @logger = Rails.logger
  end

  def run
    logger.info "Running…

Jurot King
- 79
- 7
2
votes
1 answer
Saving and loading etl pipeline from database
My current task is to build a Rails application in which users can create connections to RDBMSs (MySQL, PostgreSQL, etc.) and S3 (for CSV and JSON).
Users can add an ETL job. An ETL job may have multiple pipelines in the future, but only a single one for now.
A pipeline…

Jurot King
- 79
- 7
2
votes
1 answer
Best practice for using Kiba as a batch process on files
We'd like to run Kiba as a batch process on a series of files. What would be the best structure to give a file mask, download the files from FTP, and then run the ETL job on each, sending a success or failure notification on a per file basis?
Is…

Steve Wetzel
- 435
- 4
- 9
2
votes
1 answer
Can I duplicate rows with kiba using a transform?
I'm currently using your gem to transform a CSV that was web-scraped from a personnel database that has no API.
From the scraping I ended up with a CSV. I can process it fine using your gem; there's only one bit I am wondering about.
Consider the…

Andy
- 23
- 2
2
votes
1 answer
Is it possible to do a lookup using Kiba?
Is it possible to do a "Lookup" with Kiba?
It's quite a common step in an ETL process.
If so, could you show a demo? Thanks.

L_G
- 209
- 2
- 10
2
votes
0 answers
Pass Parameters to Kiba run Method
I'm trying to programmatically use something similar to the code that's used for the Kiba CLI, as ...
filename = './path/to/script.rb'
script_content = IO.read(filename)
job_definition = Kiba.parse(script_content, filename)
…

slabounty
- 704
- 11
- 21
1
vote
1 answer
Is there a way to return some data at the end of a Kiba job?
It would be great if there was a way to get some kind of return object from a Kiba ETL run so that I could use the data in there to return a report on how well the pipeline ran.
We have a job that runs every 10 minutes that processes on average 20 -…

Gabriel Fortuna
- 122
- 7
1
vote
1 answer
How to filter data in extractor?
I've got a long-running pipeline that has some failing items (items that at the end of the process are not loaded because they fail database validation or something similar).
I want to rerun the pipeline, but only process the items that failed the…

Viktor
- 2,982
- 27
- 32
1
vote
1 answer
How to log "current status" of ETL job?
I'm running a Kiba ETL pipeline in a Rails background job. I'd like to provide some status to the user while the job is running. What would be the best way to achieve this?
Can I use some variable somehow?
Or should I save the status update in the…

Viktor
- 2,982
- 27
- 32