First, some background: we are developing a data warehouse and researching which tools to use for our ETL process. The team is very developer-centric, and everyone is comfortable with C#. So far I have looked at RhinoETL, Pentaho (Kettle), and Astera Centerprise. SSIS is out for a number of reasons that are outside the scope of this question.

At this time, I am leaning towards something more developer-oriented like RhinoETL because it seems like the path of least resistance for a group of devs. Do the other, more visual, designer-oriented products bring anything to the table that RhinoETL doesn't? Are there any specific things I should be paying attention to when evaluating these ETL tools? Are there any other tools that we should also investigate?

Matt
  • Have you tried IBM's Design studio? –  Oct 04 '11 at 00:00
  • As far as I can see, you are interested in an open source solution. I don't know about Rhino, but you might want to reconsider discarding a visual designer as an option; maybe the problem is that you are programming-oriented. A visual design is easier to maintain than code that has to be read. Also, consider the scope of your project before making a final decision. Check out Talend: it is written in Java, and you can write your own transformations in Java. – Filip Popović Oct 23 '11 at 16:35

2 Answers

I know this is a late answer, but since I needed a proper ETL with all the SSIS features in a 100% .NET environment, I ended up developing my own.

Admittedly, its performance is not as good as that of SSIS. I believe that if you need maximum performance to integrate and transform huge volumes, you should still use SSIS.

The main thing I really needed, which no other ETL-like tool such as RhinoETL provides, is a proper tracing system that captures every single detail and whose traces can easily be manipulated and recorded if necessary. I made a lot of out-of-the-box adapters for the file system, FTP, SFTP, XML, CSV, Entity Framework Core, and bulk load. I even came up with a visual tool to view the structure of the transformation process.

It has taken me 10 months so far, and I open sourced it. It still lacks a lot of documentation (a huge amount of work). I must also add a much bigger set of unit tests (also a huge amount of work) before I can decently release it as a beta. Even though it is still in alpha, it is the foundation of all the ETL processes at my company, and it works like hell!

Stephane

Recently my coworker and I did some simple performance testing between RhinoETL and SSIS. It seems that for simple data flows SSIS always outperformed RhinoETL (it moved 2,000,000 records about 30% faster). If you are using source control (in our case TFS), you cannot easily view differences between versions of dtsx files (SSIS files), whereas developing with RhinoETL allows you to use the TFS features.

Another advantage of RhinoETL shows up if you develop a user interface on top of your data warehouse: you can share code between the two programs.

Although several members of our SSIS team come from .NET backgrounds, our management decided to continue developing with SSIS (although they upgraded to SSIS 2008, which is another topic altogether) because they felt it was easier to have a developer learn SSIS than .NET.

M.Babcock
David Benham
  • Since you're using TFS, check out [BIDSHelper](http://bidshelper.codeplex.com/). It has a Smart Diff feature that excludes SSIS noise, like layout changes. That makes it much easier to figure out if something important changed between revisions. – billinkc Nov 09 '11 at 15:56
  • @billinkc You, sir, just gave me another point for my business case for getting that piece of software approved. Thank you. – David Benham Nov 09 '11 at 17:11
  • @billinkc Oh my god! That tool is awesome! – Kenny Eliasson Dec 11 '12 at 12:11