11

I'm scraping a static HTML site and moving the content into a database-backed CMS. I'd like to use Textile in the CMS.

Is there a tool out there that converts HTML into Textile, so I can scrape the existing site, convert the HTML to Textile, and insert that data into the database?

Joe Van Dyk
  • 6,828
  • 8
  • 57
  • 73

5 Answers

1

I know this is an old question, but I found myself trying to do this the other day and not finding anything useful, until I found Pandoc. It can convert loads of other markup formats as well - it's quite brilliant.
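As a rough sketch of how this could be wired into a scraping job, the Java snippet below shells out to Pandoc with `-f html -t textile`. It assumes `pandoc` is installed and on the PATH; the class name `PandocTextile` and the file names are placeholders for illustration, not part of Pandoc.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class PandocTextile {
    // Builds the argument list: pandoc -f html -t textile <input> -o <output>
    public static List<String> command(Path input, Path output) {
        return Arrays.asList("pandoc", "-f", "html", "-t", "textile",
                input.toString(), "-o", output.toString());
    }

    // Runs pandoc (assumed to be on the PATH) and returns its exit code; 0 means success.
    public static int convert(Path input, Path output)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command(input, output))
                .inheritIO() // let pandoc's own error messages reach the console
                .start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: java PandocTextile <input.html> <output.textile>");
            return;
        }
        int exit = convert(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("pandoc exited with " + exit);
    }
}
```

Calling `convert` once per scraped page leaves a `.textile` file next to each HTML file, which can then be read and inserted into the database.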

slightlyfaulty
  • 1,401
  • 14
  • 13
  • But you will lose styles and other things. Then you might as well convert to Markdown. – Bruno Dec 06 '16 at 00:04
0

Here is a C# library that converts HTML to Textile. Note that it targets Textile with the library's own additions, not pure Textile.

user48841
  • 131
  • 2
0

Since there was no JavaScript implementation, I wrote one: https://github.com/cmroanirgo/to-textile

It's a little primitive at the moment, as it's a blind port of the 'to-markdown' equivalent, but should get the job done.

cmroanirgo
  • 7,297
  • 4
  • 32
  • 38
-2

This is a simple markup replacement, nothing a good regex could not fix.

I recommend Perl, LWP::Simple, and some regexes to do the whole thing (spidering, stripping design and menus, converting to Textile, and then posting to the database).
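To give an idea of what the regex approach looks like, here is a minimal sketch of the conversion step only, written in Java rather than Perl. The class `HtmlToTextile` and its rule set are illustrative assumptions: it handles just headings, links, bold, italics, line breaks, and paragraphs, and it will misbehave on nested or malformed markup, which is the usual weakness of regex-based HTML handling.

```java
import java.util.regex.Pattern;

public class HtmlToTextile {
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;

    // Applies one regex replacement, ignoring case and letting '.' span newlines.
    private static String sub(String s, String regex, String replacement) {
        return Pattern.compile(regex, FLAGS).matcher(s).replaceAll(replacement);
    }

    public static String convert(String html) {
        String t = html;
        // Headings: <h2>Title</h2> becomes "h2. Title" plus a blank line
        t = sub(t, "<h([1-6])[^>]*>(.*?)</h\\1>", "h$1. $2\n\n");
        // Links: <a href="url">text</a> becomes "text":url
        t = sub(t, "<a\\s[^>]*href=\"([^\"]*)\"[^>]*>(.*?)</a>", "\"$2\":$1");
        // Bold and italic (attribute-free tags only)
        t = sub(t, "<(strong|b)>(.*?)</\\1>", "*$2*");
        t = sub(t, "<(em|i)>(.*?)</\\1>", "_$2_");
        // Line breaks and paragraphs
        t = sub(t, "<br\\s*/?>", "\n");
        t = sub(t, "<p[^>]*>(.*?)</p>", "$1\n\n");
        // Drop every tag the rules above did not handle
        t = sub(t, "<[^>]+>", "");
        return t.trim();
    }

    public static void main(String[] args) {
        String html = "<h1>Title</h1><p>Some <strong>bold</strong> text.</p>";
        System.out.println(convert(html));
    }
}
```

Tables, lists, images, and HTML entities are not covered here; each would need its own rule, which is where a dedicated converter starts to pay off.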

Osama Al-Maadeed
  • 5,654
  • 5
  • 28
  • 48
-2

Try this simple Java code; I hope it works for you.

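import java.net.*;
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class Crawle
{
    public static void main(String[] args) throws IOException
    {
        // Page to download; the "#q=..." fragment is never sent to the server.
        URL url = new URL("https://www.google.co.in/#q=i+am+happy");

        // Create the output directory first, otherwise opening the file fails.
        Path out = Paths.get("crawler/file.txt");
        Files.createDirectories(out.getParent());

        // try-with-resources closes both streams even if reading fails.
        // UTF-8 is an assumption; use the page's real encoding if it differs.
        try (BufferedReader br = new BufferedReader(
                 new InputStreamReader(url.openStream(), StandardCharsets.UTF_8));
             PrintWriter pr = new PrintWriter(
                 Files.newBufferedWriter(out, StandardCharsets.UTF_8)))
        {
            String data;
            while ((data = br.readLine()) != null)
            {
                pr.println(data);         // save the line to the file
                System.out.println(data); // echo it to the console
            }
        }
    }
}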
Simmant
  • 1,477
  • 25
  • 39
  • 1
    Has nothing to do with Textile – cmroanirgo Mar 21 '17 at 09:22
  • As per the question, he wants to crawl any website page and then save it into a text file, so what I posted in my answer relates only to that. In my answer I shared a simple example for that query. I still don't understand the reason for the downvote. – Simmant Mar 22 '17 at 10:16