11

I am looking for a tool for an ETL solution that has high daily demand and requires heavy business logic processing. I've tried Kettle and SSIS so far, and I also want to test Rhino ETL. I don't care for the visual flow structure of either Kettle or SSIS, and creating complex business rules seems really hard with them. Rhino ETL seems friendlier, as it has its own DSL to transform the data and I can also use C#.

Finally, my question is: does anyone use Rhino ETL heavily? Does it perform well compared to Kettle and SSIS? How about maintainability?

Thanks

UPDATE:

In the comparisons I made between Kettle and SSIS, Kettle was, without a doubt, better. I am considering Rhino ETL for its pragmatic approach compared to Kettle. As said in the comments, it seems a step backwards, but the kind of validation we need isn't the kind of problem Kettle is recommended for. For example, one of our integrations receives schedules that must be validated against the ones already in the system. They must not conflict, there are several types of schedule, and the conflict validation rules are complex. The system already has a User Interface to do it, and the business logic is already implemented in C# code. Any attempt to port it to Kettle seems incredibly difficult; besides, it violates the 'only one way to do a thing' principle.
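To make the kind of rule concrete, here is a minimal sketch in Python (not our actual C# code). The `Schedule` fields, the half-open interval overlap test, and the `validate_incoming` helper are all assumptions for illustration; the real rules cover several schedule types and are far more involved.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Schedule:
    # Hypothetical fields; the real schedules have several types and more data.
    resource: str
    start: datetime
    end: datetime


def conflicts(a: Schedule, b: Schedule) -> bool:
    """Two schedules conflict if they share a resource and their
    half-open intervals [start, end) overlap."""
    return a.resource == b.resource and a.start < b.end and b.start < a.end


def validate_incoming(incoming, existing):
    """Split incoming schedules into (accepted, rejected).

    Each incoming schedule is checked against the existing ones and
    against the incoming schedules already accepted in this batch.
    """
    accepted, rejected = [], []
    for schedule in incoming:
        others = list(existing) + accepted
        if any(conflicts(schedule, other) for other in others):
            rejected.append(schedule)
        else:
            accepted.append(schedule)
    return accepted, rejected
```

The point is that a rule like this is a plain function that already exists in the application, so an ETL step that can call it directly avoids reimplementing it in a visual tool.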

The 'no one uses it' problem raised in the comments is a concern for me too; that's why I am here trying to find out if anyone uses it in a heavy production environment.

Thanks for the feedback so far.

Pedro
  • 2
    SSIS, while not perfect, can do Fuzzy Matching. Try doing that in Rhino or Kettle. While SSIS is not perfect, it's far from horrible. – Keith Adler Feb 25 '10 at 17:30
  • Never heard of Rhino ETL - thanks for pointing it out. There doesn't seem to be much material about it though, and there doesn't seem to be a large community around it? That would probably make me decide against it. But anyway, from what I can tell, it looks like you're basically back to programming, which I consider a step backward compared to both Kettle and SSIS (nothing to do with the GUI). Can you give us examples of these business rules that are too complex to build in SSIS and Kettle? – Roland Bouman Feb 25 '10 at 17:30
  • One point about performance - what I like about Kettle is that you can transparently scale up the transformation, either by assigning more threads to the steps (thus utilizing more of your cores), or by clustering (or both). Perhaps SSIS has a similar feature, but I think that this feature is key to achieving good performance in the long term. I have trouble identifying whether Rhino ETL has a feature like this, but this is another thing that would influence my decision if I were in your shoes. – Roland Bouman Feb 25 '10 at 17:41
  • @Sergey: partially agree when scalability is needed @Nissan: Fuzzy Matching isn't in my transformations plans.. but thanks for pointing it out. @Roland: Please check my update – Pedro Feb 25 '10 at 18:50
  • @Nissan Fan I simply rolled my own Fuzzy Lookup in a couple of days. It is easy for a person with knowledge of string similarity algorithms. – Sergey Mirvoda Feb 25 '10 at 19:38
  • You do know you can use c# in SSIS? – HLGEM Feb 25 '10 at 19:49
  • @HLGEM: Yes, I do, but I don't like SSIS in general, and SSIS's concept in using C# has a script orientation that I also dislike. – Pedro Feb 25 '10 at 20:00
  • @Pedro, it's "script" only in the sense that it's code. It is fully compiled and executed natively at runtime. – Keith Adler Feb 25 '10 at 20:51
  • @Sergey ... yes, but there's nothing in Rhino or Kettle out of the box for this. It's certainly worth considering in any comparison. – Keith Adler Feb 25 '10 at 20:51
  • @Nissan Fan yes sure, no doubt, great feature. – Sergey Mirvoda Feb 25 '10 at 20:56
  • Roland: Rhino ETL has a threaded job runner, yes. Works great: I see it hit very high CPU usage on all 8 cores I've got here. – Ken Jul 21 '10 at 22:40

2 Answers

3

As for Rhino ETL and Kettle:
Rhino is very developer oriented.
Kettle is oriented more toward a skilled administrator or a very skilled BA. Kettle's GUI is far from intuitive, but Kettle's capabilities are great.

We've developed our own ETL engine (we simply didn't know about Kettle), and our product is very similar to Kettle in capabilities and architecture, but more user-friendly and better suited to our business and/

SSIS - no comments here. DTS was a great product, simple and powerful, SSIS is horrible...

All opinions are subjective.

Newbie
Sergey Mirvoda
  • 3
    Why bother saying SSIS is horrible if you won't say why it's horrible? I happen to think you must be gone astray if you prefer the kind of hacks necessary in DTS to the clear control and data flow in SSIS. – John Saunders Feb 25 '10 at 19:50
  • The objective is to be developer oriented, with no DBA or admin interference at all... We are looking for robust, maintainable ETL code, and both Kettle and SSIS seem to lack this... – Pedro Feb 25 '10 at 20:02
  • @John do not compare simplicity with power. As for drugs, check out these links or simply STFG on Microsoft Connect and the Web. http://ayende.com/Wiki/I+Hate+SSIS.ashx http://ayende.com/Blog/archive/2006/01/12/SSISDebuggingFrustrations.aspx http://ayende.com/Blog/archive/2007/07/27/SSIS-The-backlash.aspx – Sergey Mirvoda Feb 25 '10 at 20:49
  • @Pedro hm, our ETL goals are not the same as yours. After rereading the question, I think you should check out workflow engines. – Sergey Mirvoda Feb 25 '10 at 20:54
  • 1
    @Sergey: the "drugs" comment was about liking DTS better. No drugs are required for simply disliking SSIS. I read those links, and Ayende should have got help (from Jamie Thompson, for instance), when he was learning SSIS. He raves like someone who tried to figure it out without the manual or any other help. The first five or six things he says are dead wrong. I gave up after that. Note he also doesn't say DTS was better. That opinion requires ... a different reality, hopefully induced by drugs (because one can stop doing drugs). – John Saunders Feb 25 '10 at 23:08
  • I agree with John Saunders: everyone knocks SSIS but simply states it's terrible or the worst thing ever, with no reasoning. In truth, I like SSIS the best, mainly because of its ability to load into SQL Server faster, and I've used all these ETL engines plus a couple more. – ajdams Mar 16 '10 at 20:28
2

I use it solely for loading data into a data warehouse. As these things go, it's pretty small: the daily load "only" takes 15 minutes, though I know of people using Rhino to process data over days.

I've always had good responses from the mailing list; there's a core of users there. Being able to test all the operations independently is a real boon.
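As a rough illustration of why independent testing is easy when each operation is just a step that takes rows in and hands rows out, here is a minimal sketch in Python. This is not Rhino ETL's API (Rhino ETL is a .NET library); the function names and row fields are made up for the example.

```python
def drop_incomplete(rows):
    """Operation: skip rows that have no customer id."""
    for row in rows:
        if row.get("customer_id") is not None:
            yield row


def add_total(rows):
    """Operation: add a computed 'total' column to each row."""
    for row in rows:
        yield {**row, "total": row["quantity"] * row["unit_price"]}


def run_pipeline(rows, operations):
    """Chain the operations lazily, then materialize the result."""
    for operation in operations:
        rows = operation(rows)
    return list(rows)
```

Because each operation only depends on the rows it is given, a unit test can feed it a small hand-written list and check the output, without a database or the rest of the pipeline.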

Under the hood it's refreshingly straightforward.

mr_miles
  • Good to know, thanks. We are now using Kettle, but anyway it is good to know that someone actually uses it. Life is long, many projects yet to come... Maybe next time I'll go with Rhino ETL. :) – Pedro May 20 '11 at 20:33