
Apologies for the re-post; when I posted earlier I did not have all the details.

My colleague, a C# programmer who has since quit the firm, was forced to write Java code that involved (large, dense) matrix multiplication.

He coded his own DataTable class in Java in order to:

a) create indexes to sort and join with other DataTables

b) do matrix multiplication.

The code in its current form is NOT maintainable/extensible. I want to clean up the code, and thought that using something like R within Java would help me focus on business logic rather than sorting, joining, matrix multiplication, etc.

Plus, I'm very new to the concept of DataTable; I just want to replace the DataTable with 2D arrays, and let R handle the rest.

(I currently do not know how to join 2 large datasets in Java very efficiently.)

Please let me know what you think. Also, are there any simple examples that I can take a look at?

Chapax
  • There are matrix/data table tools for most programming languages. While some are better than others, it might be worthwhile thinking about what skills you have in the office. If you quit/get hit by a bus, will anyone else be able to follow what you have done? If you have plenty of R coders, then go ahead with your plan. If everyone else is tied to Java, stick with that. – Richie Cotton May 07 '10 at 10:32
  • I frankly don't see how this question is different than your previous one: http://stackoverflow.com/questions/2658752/matrix-multiplication-in-java/. Why not just ask a specific question (e.g. how does one do matrix multiplication in R through Java) rather than having multiple posts? – Shane May 07 '10 at 19:13

3 Answers

1

Don't take this too harshly, but you seem to be preparing to replace one chunk of unmaintainable code with another chunk of unmaintainable code. How do I reach this remarkable conclusion? By your own admission your Java expertise is not quite up to the task you face, and you propose to replace a pure Java solution with Java+R.

I suggest that you identify your core skills and use whatever toolset you are most comfortable with to refactor the code. If you don't, I foresee a post on SO in a year or so from your replacement complaining about the unmaintainable code you left behind!
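
For what it's worth, if the refactoring stays in plain Java, the join the question worries about does not need much code. Below is a minimal sketch of a hash join over 2D arrays, assuming each table is a row-major `double[][]` and the join key sits in a single column. The class name `HashJoin` and the method `innerJoin` are made up for illustration; they are not part of the colleague's DataTable class or of any library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not the original DataTable code.
final class HashJoin {
    private HashJoin() {}

    /**
     * Inner-joins two row-major tables on one key column each.
     * Each output row is the full left row followed by the right row
     * with its key column dropped.
     * Assumption: keys are exact values (e.g. integer ids stored as doubles);
     * boxed Double keys treat 0.0 and -0.0 as different.
     */
    static double[][] innerJoin(double[][] left, int leftKey,
                                double[][] right, int rightKey) {
        // Build phase: index the right table by key value.
        Map<Double, List<double[]>> index = new HashMap<>();
        for (double[] row : right) {
            index.computeIfAbsent(row[rightKey], k -> new ArrayList<>()).add(row);
        }

        // Probe phase: look up each left row's key in the index.
        List<double[]> out = new ArrayList<>();
        for (double[] l : left) {
            List<double[]> matches = index.get(l[leftKey]);
            if (matches == null) {
                continue; // no partner on the right side
            }
            for (double[] r : matches) {
                double[] joined = new double[l.length + r.length - 1];
                System.arraycopy(l, 0, joined, 0, l.length);
                int pos = l.length;
                for (int j = 0; j < r.length; j++) {
                    if (j != rightKey) {
                        joined[pos++] = r[j];
                    }
                }
                out.add(joined);
            }
        }
        return out.toArray(new double[0][]);
    }
}
```

The build-then-probe structure touches each input row once, so the cost is roughly linear in the two table sizes plus the number of matches, as opposed to a nested loop over both tables. It does assume the smaller table's index fits in memory.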

High Performance Mark
  • Well said ... I accept what you're saying to a certain extent ... but believe me, there are more than just these issues bogging me down at the moment. I precisely do not want to get to another set of unmaintainable code ... just want a clean and elegant solution. I have no reason to be writing code to unnecessarily sort, matrix multiply, when already such solutions exist. I do however have other issues that I'm interested to solve. – Chapax May 07 '10 at 12:42
0

Mahout implements matrix and vector operations of this type. It also supports distributed, large-scale matrix operations, though you may want to ask around on the mailing list for guidance on how to use this pretty new code.

Sean Owen
0

Here are some options: Parallel Colt is a numerics library for Java, and Incanter is an R-like system that runs on the JVM.
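
As a baseline to compare any of these libraries against, dense matrix multiplication on plain 2D arrays can be sketched in a few lines of Java. This is only an illustration under the assumption that both inputs are non-empty, rectangular `double[][]` arrays; the class name `DenseMatrix` is hypothetical, and a tuned library will likely be faster on large matrices.

```java
// Hypothetical baseline for illustration; not taken from any library.
final class DenseMatrix {
    private DenseMatrix() {}

    /**
     * Returns the product a * b for non-empty rectangular matrices.
     * The loops run in i-p-j order so the innermost loop walks rows of
     * b and c contiguously, which is friendlier to the CPU cache than
     * the textbook i-j-p order.
     */
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;      // rows of a
        int k = b.length;      // rows of b, must equal columns of a
        int m = b[0].length;   // columns of b
        if (a[0].length != k) {
            throw new IllegalArgumentException(
                "inner dimensions differ: " + a[0].length + " vs " + k);
        }
        double[][] c = new double[n][m];
        for (int i = 0; i < n; i++) {
            double[] cRow = c[i];
            for (int p = 0; p < k; p++) {
                double aip = a[i][p];
                double[] bRow = b[p];
                for (int j = 0; j < m; j++) {
                    cRow[j] += aip * bRow[j];
                }
            }
        }
        return c;
    }
}
```

For example, multiplying `{{1, 2}, {3, 4}}` by `{{5, 6}, {7, 8}}` gives `{{19, 22}, {43, 50}}`.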

Jouni K. Seppänen