I upload a PDF and want to convert it to an xlsx file. I have tried several approaches but none gives the expected output: pdf2xls writes only a single line instead of the whole file's data. I want all of the PDF's data to appear in the xlsx file.

I have one method that converts the PDF to xlsx, but the output is not formatted properly.

def do_excel_to_pdf
    @user=User.create!(pdf: params[:pdf])
    @path_in = @user.pdf.path
    temp1 = @user.pdf.path
    @path_out = @user.pdf.path.slice(0..@user.pdf.path.rindex(/\//))
    query = "libreoffice --headless --invisible --convert-to pdf " + @path_in + " --outdir " + @path_out
    system(query)
    file = @path_out+@user.pdf.original_filename.slice(0..@user.pdf.original_filename.rindex('.')-1)+".pdf"
    send_file file, :type=>"application/msexcel", :x_sendfile=>true
end

If anyone has done this, please help. Any gem or script is welcome.

2 Answers

I would start with reading the data out of the PDF. Inserting it into the XLSX is the easy part; if you have problems with that, ask another question and specify which gem you use and what you tried.

You use LibreOffice to read the PDF, but according to the FAQ your PDF needs to be a hybrid PDF, so perhaps that is the problem.

As an alternative you could try an ebook conversion tool such as the one in Calibre, but I'm afraid you would lose too much formatting to recover the data you need.

It all depends on how the data in your PDF is structured. If it is regular text without much formatting and positioning, it can be as easy as using the pdf-reader gem.

I used it in the past and my data had a lot of formatting (you would be surprised how complicated the PDF structure is), so I had to specify for each field exactly where on the page its data had to be read. Not for the faint of heart.

Here is a simple example.

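require 'pdf/reader' # gem install pdf-reader

reader = PDF::Reader.new("my.pdf")
reader.pages.each do |page|
  # Extracted text of the page, with the layout approximated by whitespace
  puts page.text
  # Uncomment to inspect the raw content stream of the page instead
  # puts page.raw_content
end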
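
Once you have the page text, the next step is turning each line into a row of cells. Below is a rough sketch of that step, not a general solution. It assumes the columns in the extracted text are separated by at least two spaces, which depends entirely on how your PDF is laid out. The helper name rows_from_text and the sample text are made up for illustration; in practice you would pass in page.text.

```ruby
# Hypothetical helper: split layout-preserving text into rows of cells.
# Assumption: columns are separated by two or more whitespace characters,
# while single spaces belong to the cell's own text.
def rows_from_text(text)
  text.each_line
      .map(&:strip)                        # drop leading/trailing whitespace
      .reject(&:empty?)                    # skip blank lines
      .map { |line| line.split(/\s{2,}/) } # split on runs of 2+ spaces
end

# Made-up sample standing in for page.text
sample = <<~TEXT
  Name          Qty   Price
  Blue widget   2     9.50

  Red widget    10    12.00
TEXT

rows_from_text(sample).each { |row| p row }
```

Each resulting array is one spreadsheet row, so it can be handed to whichever xlsx-writing gem you pick. If your columns are not cleanly separated by whitespace, you are back to reading each field by its position, as described above.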
peter

I could not find an option to convert PDF to xlsx, but there are API options for converting PDF to image and PDF to PowerPoint (link below). I am not sure whether you can change the requirement so that the results are shown in another format.

http://www.convertapi.com/

BEECEE
  • You are right, but I don't use any API for converting to xlsx. For example, I am converting PDF to Excel using "unoconv -d document --format=pdf File Name" – Chaudhary Prakash Oct 18 '16 at 07:29