0

I hope you are doing well.

I need a PHP library that converts a PDF file containing images into an HTML file, with the following features:

  1. The HTML file must be compatible with HTML 3.2.
  2. Images in the PDF file must be saved with a .jpg extension.
  3. The correct fonts from the PDF must be used in the HTML file.
  4. The images and the HTML file must be placed together in one result folder.

I have tried most of the available PHP libraries, but most of them do NOT do what I need.

Please let me know of a library that meets all four requirements above (image attached for reference).


Waiting for your kind responses.

Thanks

Martin
Owais
  • I've done it for you. – Martin Jul 13 '16 at 11:39
  • HTML version 3.2 is extremely old. It was superseded ***twenty*** years ago. If you're having issues, the first step is to stop relying on 20-year-old technology and use something more recent (and therefore better supported). – Martin Jul 13 '16 at 11:40
  • Thanks for your response. Exactly, it is very old, and that is why I am not able to do it, BUT it is a client requirement that the HTML must be 3.2. Anyway, 20 years ago I was a baby, lol. Thanks for your brief answer. This will definitely help him out. I am on it. – Owais Jul 13 '16 at 12:20
  • Why does the client need that specific HTML version? I have given you a fuller answer below with some useful links. – Martin Jul 13 '16 at 12:49
  • I also don't know why he wants that. Anyhow, I have asked him to review his requirements. Let's see what happens. – Owais Jul 13 '16 at 14:58

2 Answers

1

I am not very sure, but here is a PHP library I found. Here

Fida
0

Try this:

http://www.pdfaid.com/pdf-to-html.aspx

Or this: http://webdesign.about.com/od/pdf/tp/tools-for-converting-pdf-to-html.htm

Or this... http://www.pdfconvertonline.com/pdf-to-html-online.html

There are plenty of options available to you; the secret is to use a newfangled thing called a search engine, such as Bing or Google.

You would also do well to research on Stack Overflow before asking your question:

1) HTML 3.2 was superseded in 1997, very nearly twenty years ago. Why on earth do you still need a comparatively ancient technology when far better options are available, such as XHTML, HTML 4.01 and HTML5?

2) Please read "How can I extract embedded fonts from a PDF as valid font files?"

3) Also, to extract images you can use http://www.makeuseof.com/tag/extract-images-pdf-files-save-windows/ but again, there are several options available to you if you care to look for them.
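One option not covered by the links above is the `pdfimages` command-line tool from Poppler, whose `-j` flag writes JPEG-encoded images as `.jpg` files. The following is a minimal sketch in Python, assuming `pdfimages` is installed and on the PATH; the function names, the `img` prefix and the folder layout are hypothetical choices for illustration, and the same call could be made from PHP.

```python
import subprocess
from pathlib import Path


def build_pdfimages_command(pdf_path: str, out_dir: str, prefix: str = "img") -> list:
    """Build the argument list for Poppler's pdfimages.

    -j asks pdfimages to save JPEG-encoded (DCT) images as .jpg files.
    The last argument is the "image root": output files are named
    <root>-000.jpg, <root>-001.jpg, and so on.
    """
    image_root = str(Path(out_dir) / prefix)
    return ["pdfimages", "-j", pdf_path, image_root]


def extract_images(pdf_path: str, out_dir: str) -> list:
    """Run pdfimages and return the names of the .jpg files it produced.

    Assumes the pdfimages binary is available; raises CalledProcessError
    if the tool exits with an error.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_pdfimages_command(pdf_path, out_dir), check=True)
    return sorted(p.name for p in Path(out_dir).glob("*.jpg"))
```

Images that are not stored as JPEG inside the PDF are written by `pdfimages` in other formats (PBM or PPM), so a further conversion step would be needed to satisfy a strict `.jpg` requirement.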

Your question suggests a fundamental misunderstanding about HTML: there are several different ways of getting any desired result. You have a PDF file and you want it to look a certain way, but that look depends on the browser it is viewed in. For example, if you use one of the PDF-to-HTML converters linked above, you will very probably find that the output looks different in Internet Explorer 7, Firefox and Internet Explorer 10. There is no single way of writing output in HTML or CSS.

If you want a custom-built library to do your specific task, you will need to employ a professional or code it yourself. The cost should obviously be charged to the client for requiring an extremely outdated technology. You can probably search GitHub for a similar library (the one linked by CK Khan looks like what you're after), then fork it and make your own variation for your needs. I very much doubt anyone is going to put time into developing a system to output HTML 3.2 from a PDF, and it is even less likely that anyone will develop it for free and to your exact specifications.

It also appears that you cannot specify font families in the <font> tag in HTML 3.2; you can only set the size and colour of fonts. You can use the CSS1 font-family property to set font families. See here.
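To make the limitation concrete, here is a minimal sketch in Python of what a converter could emit for one run of text using only the SIZE and COLOR attributes that the HTML 3.2 <font> tag accepts. The function names and the point-size breakpoints are assumptions for illustration, not part of any existing library; note that the PDF's font family has nowhere to go in this markup.

```python
from html import escape


def font_size_from_points(points: float) -> int:
    """Map a PDF point size to an HTML 3.2 FONT SIZE value (1 to 7).

    The breakpoints are an assumed mapping chosen for illustration.
    """
    breakpoints = [8, 10, 12, 14, 18, 24]  # upper bounds for sizes 1..6
    for size, upper in enumerate(breakpoints, start=1):
        if points <= upper:
            return size
    return 7  # anything larger than 24pt gets the largest size


def font_run(text: str, points: float, rgb: tuple) -> str:
    """Wrap text in a <font> tag carrying only size and colour."""
    red, green, blue = rgb
    color = "#{:02X}{:02X}{:02X}".format(red, green, blue)
    size = font_size_from_points(points)
    return '<font size="{}" color="{}">{}</font>'.format(size, color, escape(text))
```

For example, `font_run("A & B", 12, (255, 0, 0))` yields `<font size="3" color="#FF0000">A &amp; B</font>`.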

Community
Martin