
I'm investigating Google Refine to speed up some of my data work -- I'd never used it before this week, but I like a lot of what I see.

My biggest question so far is whether it's possible to call external Python functions from Refine. I know you can call Jython internally, but that doesn't provide access to C-based Python libraries (e.g. lxml), and I have scripts elsewhere that I'd like to integrate without a lot of copy-paste or rewriting.

What options are there for doing this in Refine? I'm willing to get creative -- I just want a stable, re-usable solution.

pnuts
Abe

2 Answers


As the Google Refine wiki says:

lxml will NOT work in Jython, since lxml has C bindings for CPython (regular Python), hence will not work in Refine which is Jython / Java only, and has no CPython interpreter built-in

But you can try the Google Refine Python Client Library to create projects and manipulate your data programmatically.

reclosedev
  • Yes, I've read this part of the documentation. I'm asking the opposite question: not, "how to call refine from python," but "how to call python from refine." – Abe Feb 02 '12 at 19:22
  • @Abe, I think that `...and has no CPython interpreter built-in` means that it's impossible. But you can probably call external processes (e.g. Python scripts) from Jython, just not functions. – reclosedev Feb 02 '12 at 20:08
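
To make the "call an external process" idea from the comment above concrete, here is a minimal sketch. It assumes Refine's embedded Jython is allowed to spawn processes, which I have not verified. `run_external` and `EXAMPLE_COMMAND` are hypothetical names, and the inline one-liner only stands in for a real CPython script.

```python
import subprocess
import sys


def run_external(value, command):
    """Send `value` to an external program on stdin and return its stdout."""
    process = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # exchange text rather than bytes
    )
    output, _ = process.communicate(value)
    if process.returncode != 0:
        raise RuntimeError(
            "external command failed with code %d" % process.returncode
        )
    return output.strip()


# Hypothetical stand-in for an external CPython script: upper-cases stdin.
# Inside Refine you would point this at a real CPython interpreter and
# script instead of sys.executable (which would be Jython there).
EXAMPLE_COMMAND = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]
```

Inside a Refine Jython expression the call would look roughly like `return run_external(value, EXAMPLE_COMMAND)`. Starting one process per cell is slow, so this probably only suits small projects.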

I'm going to mark reclosedev's answer as accepted, but there's still a little more to the story.

The other answer to this question is that you can set up your own Python-based API. For this project, I was able to set up a Django app running on a local server. It only took an hour or so to build an API around my existing library.

More hassle than I'd have liked, but it fit the bill for this project without soaking up too much time.
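
For anyone who wants the general shape of the local-API approach, here is a rough sketch. It uses Python's built-in `http.server` rather than Django, so it is not the app described above. `clean_value`, `TransformHandler`, `start_server`, and `call_api` are all hypothetical names, with `clean_value` standing in for a function from an existing library.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, quote, urlparse
from urllib.request import urlopen


def clean_value(text):
    """Hypothetical stand-in for a function from an existing library."""
    return " ".join(text.split()).title()


class TransformHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Read ?value=... from the query string and apply the function.
        query = parse_qs(urlparse(self.path).query)
        value = query.get("value", [""])[0]
        body = clean_value(value).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging


def start_server(port=0):
    """Serve on localhost in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), TransformHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_api(port, value):
    """Client side: fetch the transformed value for one cell."""
    url = "http://127.0.0.1:%d/?value=%s" % (port, quote(value))
    with urlopen(url) as response:
        return response.read().decode("utf-8")
```

The server half runs under regular CPython, so it can import lxml or anything else. On the Refine side, the same request could be made from a Jython expression (with `urllib2` instead of `urllib.request`, assuming a Jython 2.x runtime) or with Refine's "Add column by fetching URLs" feature.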

Abe