Is it possible to render a PDF in a browser using Perl? I have a Flash application that sends the rendered PDF binary to a Perl script. The PDF is generated with AlivePDF.

#!C:\Perl\bin\perl.exe
##
BEGIN { $ENV{PATH} = ''; delete @ENV{ 'IFS', 'CDPATH', 'ENV', 'BASH_ENV'}; }
use strict;
use warnings;
no warnings qw (redefine closure);
use CGI;
my $CGI = new CGI();

#name=generated.pdf&method=inline these are passed via the URL and are in the environmental variable QUERY_STRING
my %nv_pairs = map{my @tmp = split(/=/,$_);$tmp[0] => $tmp[1] }split(/&/,$ENV{QUERY_STRING});
my $name = $nv_pairs{name};
my $method = $nv_pairs{method};

#Raw Data is stored in the POST Parameter POSTDATA
my $pdf = $CGI->param('POSTDATA');

print "Content-Type: application/pdf\r\n";
print "Content-Length: " . length($pdf) . "\r\n";
print "Content-Disposition :$method\n\n";
print $pdf;

The problem is that I want the browser to actually render the PDF. If I save the binary data and open it manually in Adobe Reader, it renders properly.

I would like it to render in the browser, but I don't know how to make that happen.

Currently the output (what the browser displays), looks like this:

Content-Type: application/pdf
Content-Length: 432785
Content-disposition:inline; filename="test.pdf"

%PDF-1.5
1 0 obj
<</Type /Pages
/Kids [3 0 R 5 0 R]
/Count 2>>
endobj
3 0 obj
<</Type /Page
/Parent 1 0 R
/MediaBox [0 0 612.00 792.00]
/Resources 2 0 R

This is only part of the displayed file, but I hope this helps. I don't want the code to display, I want it to look graphical. If I download this file, and change the extension to .pdf, it works perfectly.

tshepang
Cam
  • So you want to render what the pdf would look like without rendering the pdf? Can't you just provide hyperlink to the pdf itself? – Ilion Jan 13 '12 at 23:09
  • I'm not sure what you mean. I have the code that creates the pdf, but I want it to render (look like an image). I hope that makes sense. – Cam Jan 14 '12 at 05:47
  • It's the "look like an image" part that is confusing me. Wouldn't it be best to have the browser simply display the pdf? To me, what you're saying sounds like you want the pdf to be displayed as a jpg or something. – Ilion Jan 14 '12 at 07:15
  • I do want it to display the pdf. Please see my edit. – Cam Jan 14 '12 at 14:09
  • Why do you parse QUERY_STRING manually? – Alexandr Ciornii Jan 14 '12 at 15:27
  • I'm not sure what that means. I am just learning Perl. – Cam Jan 14 '12 at 15:28
  • QUERY_STRING is not Perl-related. Why do you use such a complex method for filling %nv_pairs? – Alexandr Ciornii Jan 14 '12 at 17:24
  • Since you're already using the CGI module, it should parse them for you. You can access them using the [$query->param()](http://search.cpan.org/~markstos/CGI.pm-3.59/lib/CGI.pm#FETCHING_A_LIST_OF_KEYWORDS_FROM_THE_QUERY:) method. – Ilion Jan 16 '12 at 06:40

2 Answers

You need to add the following HTTP header:

print "Content-Transfer-Encoding: binary\n";

The following works for me to read a PDF file and display it:

use strict;
use warnings;

my $file = "discover.pdf"; # a pdf I happen to have
my $pdf;

open (my $fh, '<', $file) or die "Cannot open $file: $!"; # three-argument open, fail loudly
binmode $fh; # set the file handle to binary mode
while (<$fh>){ $pdf .= $_; } # read it all into a string;
close ($fh);

showPdf($pdf); # call the display function

sub showPdf {

        my $pdf = shift;
        my $file = shift || "new.pdf"; # if no name is given use this
        my $method = shift || "inline"; # disposition type: inline or attachment
        my $size = length($pdf);

        binmode STDOUT; # send the PDF bytes unchanged (matters on Windows)

        print "Content-Type: application/pdf\n";
        print "Content-Length: $size\n";
        print "Content-Disposition: $method; filename=\"$file\"\n";
        print "Content-Transfer-Encoding: binary\n\n"; # blank line to separate headers

        print $pdf;

}

The same function can be added to the original code and should work like this:

#!C:\Perl\bin\perl.exe
##
BEGIN { $ENV{PATH} = ''; delete @ENV{ 'IFS', 'CDPATH', 'ENV', 'BASH_ENV'}; }
use strict;
use warnings;
no warnings qw (redefine closure);
use CGI;
my $CGI = new CGI();

#name=generated.pdf&method=inline these are passed via the URL and are in the environmental variable QUERY_STRING
my %nv_pairs = map{my @tmp = split(/=/,$_);$tmp[0] => $tmp[1] }split(/&/,$ENV{QUERY_STRING});
my $name = $nv_pairs{name};
my $method = $nv_pairs{method};

#Raw Data is stored in the POST Parameter POSTDATA
my $pdf = $CGI->param('POSTDATA');

showPdf($pdf, $name, $method);

sub showPdf {

    my $pdf = shift;
    my $file = shift || "new.pdf"; # if no name is given use this
    my $method = shift || "inline"; # disposition type: inline or attachment
    my $size = length($pdf);

    binmode STDOUT; # send the PDF bytes unchanged (matters on Windows)

    print "Content-Type: application/pdf\n";
    print "Content-Length: $size\n";
    print "Content-Disposition: $method; filename=\"$file\"\n";
    print "Content-Transfer-Encoding: binary\n\n"; # blank line to separate headers

    print $pdf;

}
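
Because the original script runs on Windows, it may also matter that STDOUT is in binary mode before the PDF is printed. In text mode Perl applies the `:crlf` layer there and rewrites every `\n` byte as `\r\n`, so the body no longer matches `Content-Length`. The sketch below reproduces that effect on any platform by writing to in-memory handles. `write_through` is a hypothetical helper used only for this illustration, and the short string is a stand-in for real PDF data.

```perl
use strict;
use warnings;

# Hypothetical helper: write $bytes through the given PerlIO layer into an
# in-memory buffer and return what actually ends up in the "output".
sub write_through {
    my ($layer, $bytes) = @_;
    my $buffer = '';
    open(my $out, ">$layer", \$buffer)
        or die "Cannot open in-memory handle: $!";
    print {$out} $bytes;
    close($out) or die "Cannot close in-memory handle: $!";
    return $buffer;
}

# Stand-in for PDF data: binary content that happens to contain newlines.
my $pdf = "%PDF-1.5\n1 0 obj\n";

my $raw  = write_through(':raw',  $pdf);   # what a binmode'd handle writes
my $text = write_through(':crlf', $pdf);   # what a Windows text-mode handle writes

printf "original: %d bytes, raw: %d bytes, crlf: %d bytes\n",
    length($pdf), length($raw), length($text);
```

If the last two numbers differ, the browser receives more bytes than the header announced, which is one way a valid PDF can end up unreadable.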
Ilion
  • In your header lines change the \r\n to \n\n. If that doesn't fix it can you update what you're currently getting as the output? – Ilion Jan 14 '12 at 18:06
  • That still doesn't work. The output is still as I've described in my edit above. – Cam Jan 14 '12 at 18:16
  • I edited my solution with a proof of concept script I was using. Is it printing the new headers I suggested? – Ilion Jan 14 '12 at 19:03
  • No. Now I'm getting errors. I'm not trying to open a file as you are with "discover.pdf"; I'm passing in a param as bytes. – Cam Jan 14 '12 at 20:04
  • What errors are you getting? You said it works if you write out to a file. Do you pass it in the same way and simply print `$CGI->param('POSTDATA');` into a file? If so there shouldn't be a difference between that and reading a file into a scalar then outputting it like I am. – Ilion Jan 14 '12 at 20:12
  • -1 for lack of binmode and cargo-culting and misspelling the Accept-Range header – daxim Jan 15 '12 at 11:24
  • Edited to add binmode. The header is Accept-Ranges, not Accept-Range. (Unless I'm missing some other spelling error?) Cargo-culting? – Ilion Jan 15 '12 at 11:48
  • I confused it with Accept-Range, consider the claim about misspelling wrong/moot. – More critique: You call it HTML header, this is wrong. Your program as written is not able to satisfy range requests, it goes against protocol to add the header indicating acceptance. [You do not understand what it does, but you add it anyway.](http://enwp.org/Cargo_cult_programming) In your program, the response body is not separated from the header as required by protocol. Most importantly: your program takes a file from the file system which is totally not what the OP wants. – daxim Jan 15 '12 at 12:58
  • Edited to properly refer to HTTP headers, added in the proper blank line to separate headers from body, separated the important bits out into a function and showed how to use it in the original code. Relatively new to these parts so I hope everything is now in order. It's funny how when I double checked things so many people insisted on the Accept-Ranges header. I couldn't recall having used it last time I wrote one of these but was bowing to the common knowledge. Guess I should have trusted my instincts. – Ilion Jan 16 '12 at 06:35
  • @Cameron also I'm replying in the other thread about the query string issue that was mentioned. I didn't correct it here because I wasn't able to test that. – Ilion Jan 16 '12 at 06:36

I do not have the Flash app that creates the PDF in the request body, so I verified the program against a static resource served with the same response headers. Content-Disposition is the crucial one. This was tested in Konqueror with the Okular KPart and works; I fully expect other browser/plug-in combinations to work as well.

#!/usr/bin/perl -T
# ↑↑↑↑↑
# on Windows you can just write …
#!perl -T
# … instead, using the Unix shebang however does no harm
use strict;
use warnings FATAL => 'all';
use CGI qw();
use IO::File qw();

# delete @ENV{qw(BASH_ENV CDPATH ENV IFS PATH)};
# ↑↑↑↑↑
# Cleaning path is required for taint-checked programs
# that want to run other programs. It does not affect anything here,
# so I commented it out.

my $c = CGI->new;

# untaint data coming from outside
my ($name) = defined $c->url_param('name') ?
    $c->url_param('name') =~ /\A ([a-zA-Z_-]{1,40}\.pdf) \z/msx : ();
my ($method) = defined $c->url_param('method') ?
    $c->url_param('method') =~ /\A (attachment|inline) \z/msx : ();
die 'invalid parameters' unless defined $name and defined $method; # both are needed for the header below

# FIXME: untaint blindly because I don't know how to validate PDF
my ($pdf) = $c->param('POSTDATA') =~ /(.*)/msx;

STDOUT->binmode(':raw');
STDOUT->print($c->header(
    -Content_Type        => 'application/pdf',
    -Content_Length      => length($pdf),
    -Content_Disposition => qq($method; filename="$name"),
));
STDOUT->print($pdf);

Be aware that you are mixing GET and POST parameters. Learn how to write secure CGI programs.
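
The header handling above can also be sketched without CGI.pm, as a plain function that checks both URL parameters before any header is built. `build_pdf_headers` is a hypothetical name, and the whitelist patterns are copied from the script above. Treat this as an illustration of the idea under those assumptions, not as a replacement for the CGI module.

```perl
use strict;
use warnings;

# Hypothetical helper: return the CGI response header block for a PDF,
# or die if either value fails the whitelist.
sub build_pdf_headers {
    my ($name, $method, $length) = @_;

    die "invalid file name\n"
        unless defined $name && $name =~ /\A[a-zA-Z_-]{1,40}\.pdf\z/;
    die "invalid disposition\n"
        unless defined $method && $method =~ /\A(?:attachment|inline)\z/;

    # No space before the colon, and the file name goes in double quotes.
    return join "\r\n",
        "Content-Type: application/pdf",
        "Content-Length: $length",
        qq(Content-Disposition: $method; filename="$name"),
        '', '';    # the two empty strings produce the blank line that ends the headers
}

binmode STDOUT;    # keep the line endings exactly as written
print build_pdf_headers('generated.pdf', 'inline', 432785);
```

Keeping the validation and the formatting in one place means a request with a missing or unexpected `name` or `method` fails before anything is sent to the browser.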

daxim
  • Still, this just prints out plain text and not a rendered pdf. Any thoughts on what else I could be doing wrong? – Cam Jan 15 '12 at 15:20
  • With which browsers/plug-ins did you test this? – daxim Jan 16 '12 at 01:23
  • Which plug-in? All this cannot possibly work if you do not have a PDF plug-in set up in your browser(s) to intercept the MIME type. – daxim Jan 16 '12 at 16:14
  • I don't know if I have any plugins that are PDF specific. Can you explain please? – Cam Jan 16 '12 at 16:44
  • You can only expect to render a PDF in the browser when you have a capable plug-in! I think this whole affair starts to turn out as a wild goose chase. Try going to the location `about:plugins` for information (copy address, paste into URL bar). – daxim Jan 16 '12 at 17:27
  • I have Chrome PDF Viewer for Chrome. That's it. – Cam Jan 17 '12 at 15:47
  • Just installed Chrome 16.0.912.75 with the built-in PDF viewer enabled, program works for me. The mistake is somewhere else. – daxim Jan 17 '12 at 16:09