12

I've seen it's possible to make a connection between Mathematica and MySQL databases using Input Needs["DatabaseLink"] and conn = OpenSQLConnection[JDBC["MySQL(Connector/J)", "yourserver/yourdatabase"], "Username" -> "yourusername", "Password" -> "yourpassword"] (in case anyone wants to give it a try). Documentation of DatabaseLink here, by the way.

Does anyone have experience using Mathematica in this way, probably to analyze data contained in the database? Are there obvious drawbacks (speed, memory needed, etc.)?

Matthew Strawbridge
  • 19,940
  • 10
  • 72
  • 93
r_31415
  • 8,752
  • 17
  • 74
  • 121
  • I was using it for analyzing metadata of a pretty large industrial Oracle database, in the context of semi-automatic correction of XML Hibernate mappings (the Java project for which I needed it used Hibernate as an ORM layer) due to the changes in a database (table names, column names, etc). It worked great for me, but since I was mostly interested in metadata, I did not do extensive data imports from a database to Mathematica or commits back to db, so can not comment on performance. I expect it to be decent though. – Leonid Shifrin Jul 28 '11 at 21:44
  • Thank you very much for your input. Actually, you addressed an issue I was considering, that is, make insertions back to the database. Given the wide array of data manipulations Mathamatica can accomplish, surely there are some applications using insertions via Mathematica. Glad to know it can work great in other scenarios, although performance is certainly an issue. – r_31415 Jul 28 '11 at 22:50
  • Not about MySQL, but also worth looking at if you want fast database access is [QLink](http://science.sander.su/QLink.htm) for [TokyoCabinet](http://www.igvita.com/2009/02/13/tokyo-cabinet-beyond-key-value-store/). – Simon Jul 29 '11 at 01:24
  • BTW It seems that Tokyo Cabinet has been superseded by [Kyoto Cabinet](http://fallabs.com/kyotocabinet/)... maybe QLink needs updating. These types of databases are natural for storing Mathematica data in that "The database is a simple data file containing records, each is a pair of a key and a value... There is neither concept of data tables nor data types..." – Simon Jul 29 '11 at 01:33
  • @Simon - Thanks a MILLION for sharing the links. I've never come across these database tools before (and believe me, I do a LOT of fishing for Mathematica code/goodies). You should add them to the Wiki if you haven't already done so. You might also mention Sal Mangano's presentation on Kdb+ (video on YouTube). Cool stuff. – telefunkenvf14 Aug 03 '11 at 00:53
  • @telefunkenvf14: Not a problem - sharing links is easy! Which wiki are you talking about? – Simon Aug 03 '11 at 04:45
  • You're welcome. Mine was just a tweak of @esmit's edit, which I approved. – Matthew Strawbridge Jul 24 '13 at 05:58

2 Answers2

10

I have not used DatabaseLink` with MySQL, but I have used it extensively with Oracle, SQL Server and HSQLDB. Most of my DatabaseLink` usage has been on Mathematica versions 6 through 8, on Windows. I have used it for both reading and writing, sometimes with very large data sets. My experience has been that the facility works as expected. I did not run into any unusual situations involving performance or memory. I would expect you to find SQL access in Mathematica to be comparable to what you have experienced in other languages. In particular, it performs very much like using Java -- presumably because Mathematica uses Java internally to do all of the heavy lifting.

I will mention that the "obvious" way to execute queries in Mathematica, SQLExecute, will read the entire result set into memory. If you want work with large result sets without running out of memory, take a look at the result set functions SQLResultSetOpen et al.

For a very basic example of SQL use in Mathematica (using HSQLDB), see my response to The best way to construct a function with memory.

Community
  • 1
  • 1
WReach
  • 18,098
  • 3
  • 49
  • 93
  • Great answer and good to know about Result Sets (http://reference.wolfram.com/mathematica/DatabaseLink/tutorial/ResultSets.html ). By the way, what did you do with your data in Mathematica and after using Mathematica? – r_31415 Jul 28 '11 at 23:48
  • @Robert I don't have a good short answer to that question. Examples include data-mining operational data from a software development team, or migrating data across structural changes to a complex database model. In those examples, Mathematica was used principally for visualization, cross-technology data access and prototyping. – WReach Jul 29 '11 at 01:39
  • No worries about not giving a 'good short answer'. I just wanted to know the performance needed from your application. Thanks a lot. – r_31415 Jul 29 '11 at 04:38
7

I recently used databases to speed up a Manipulate[] block.

Without the database, essential data from a 150 MB ASCII file were required in memory for access. As a result, the Manipulate[] block slowed down. It's possible that PackedArray[] would have helped. I didn't investigate this.

With the database, the speed of access of individual datasets is slightly slower than a Select[] block, but the memory footprint is down by a factor of nearly 10.

I'd say go for it.

dwa
  • 504
  • 5
  • 12
  • Thanks for your answer. Uhm, it seems that Mathematica is well equipped to deal with this situation. I'm surprised its use isn't more widespread. I will ask you the same question I asked to WReach. What did you do after you manipulated your data with Mathematica? – r_31415 Jul 28 '11 at 23:53
  • @Robert Smith: The application is QC of airborne geophysical data. After I select a line, I need to plot 4 graphs, two line-based plots & at least one point-based plot so I can figure out which data points to omit from a least-squares inversion scheme. The Manipulate[] block is over lines, then over stations in each of those lines, 1 -> 37, then 1 -> (up to 5000). The completed application will write a bunch of integers back to the database indicating which channels should be kept. I'm using a sqlite database. The only unsavory aspect was writing the db in Mathematica . – dwa Jul 29 '11 at 01:00
  • After your comment, it seems that using Mathematica for a relatively intensive task it's not too bad. Thanks a lot! – r_31415 Jul 29 '11 at 04:40
  • By the way, I will accept this answer because it was the first to be posted, however, all the answers and comments were extremely useful. – r_31415 Jul 29 '11 at 04:42