
I have a server running on AWS Linux. The application uses poppler-utils.

The server is CI integrated. So all the necessary dependencies are installed before the application is deployed. One of the dependencies is poppler-utils.

Until now I have been installing it with $ yum install poppler-utils. Recently I realized that the version in the Amazon Linux repo hasn't been updated for ages (0.26.5, versus 20.08 on Ubuntu, roughly six years apart).

I can of course build and install it from source (using make and make install) on a single machine. For CI/CD purposes, though, I need something that is fast to install and deploy (yum packages work great for this).

How can I get a ready-to-deploy recent version of poppler-utils?

A few ideas I have explored:

  1. Install from another (non-Amazon Linux) repo that has a recent version of poppler-utils.
  2. Build an RPM myself. I have never built one, so the task looks daunting.

Looking for some direction on which path to pursue.

silent_grave
  • What about creating a custom AMI with a new version of poppler-utils compiled and set up? – Marcin Mar 17 '21 at 11:36
  • Hi @Marcin, that's a wonderful idea. Let me try this! – silent_grave Mar 17 '21 at 11:52
  • Let me know how it goes. If it works, I can provide an answer with extra info. – Marcin Mar 17 '21 at 11:54
  • @Marcin The AMI-based solution worked for me. Please provide an answer so I can accept it. Also, building poppler wasn't straightforward, since it requires building cmake and installing some other packages first. Maybe zethw can provide another answer with the steps for that. – silent_grave Mar 23 '21 at 09:18
  • Thanks. Answer provided. You could post a new question about the poppler build, which you could even answer yourself for future reference. – Marcin Mar 23 '21 at 09:23

3 Answers


I spent about three days on this issue. It turns out that Amazon Linux is essentially CentOS 7, and according to https://pkgs.org/download/poppler-utils, 0.26.5 (Sep 2014) is the last version available for CentOS 7, 0.66.0 (June 2018) for CentOS 8, and 20.11.0 (Nov 2020) for CentOS 8 Stream. The latest release is 21.03 (March 2021).

I tried, unsuccessfully, to build my own versions of the libraries by following a bunch of http://www.linuxfromscratch.org articles and installing a lot of prerequisites. The biggest issue I kept running into is that the version I built was not being used and the version installed via yum was, so the dependency versions I was trying to satisfy were not being recognized. I don't want to mess with yum and screw everything else up.

So I've gone down the path of Docker... one of those things that I knew I should have learned but never got around to. It is the perfect solution. I based my Dockerfile on the question "Installing Poppler utils of version 0.82 in docker", with the versions updated to the most recent.

Once you have built the image from the Dockerfile, create an AMI so you have a starting point and don't have to wait for everything to download and build again.
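To illustrate how an application might then call the containerized tools, here is a minimal Python sketch. The image name `my-poppler:latest`, the `/work` mount point, and both function names are hypothetical, and it assumes the image has no ENTRYPOINT that would swallow the `pdftoppm` command.

```python
import os
import subprocess

IMAGE = "my-poppler:latest"  # hypothetical name for the image built from the Dockerfile


def pdftoppm_in_docker_cmd(pdf_path, output_prefix, image=IMAGE, dpi=150):
    """Build a `docker run` command that renders a PDF to PNG pages inside the container."""
    workdir = os.path.dirname(os.path.abspath(pdf_path))
    return [
        "docker", "run", "--rm",
        "-v", "%s:/work" % workdir,  # mount the PDF's directory into the container
        "-w", "/work",               # run pdftoppm from the mounted directory
        image,
        "pdftoppm", "-png", "-r", str(dpi),
        os.path.basename(pdf_path), output_prefix,
    ]


def convert(pdf_path, output_prefix):
    """Run the conversion; raises CalledProcessError if docker or pdftoppm fails."""
    subprocess.run(pdftoppm_in_docker_cmd(pdf_path, output_prefix), check=True)
```

Calling `convert("/tmp/docs/a.pdf", "page")` would write the PNG pages next to the PDF on the host, because the output goes into the mounted directory.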

zethw

Based on the comments.

The proposed solution was to build a custom AMI:

You can launch an instance from an existing AMI, customize the instance (for example, install software on the instance), and then save this updated configuration as a custom AMI. Instances launched from this new custom AMI include the customizations that you made when you created the AMI.

Thus the AMI was created with a current version of poppler-utils, which ensures that any instance launched from the AMI will have an up-to-date poppler.
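For anyone who wants to script the AMI creation step rather than use the console, a rough sketch with boto3 might look like the following. The function name, the description text, and the example instance ID are placeholders. Note that `create_image` reboots the instance by default unless `NoReboot=True` is passed.

```python
def create_ami_from_instance(ec2_client, instance_id, name):
    """Save a customized instance as an AMI and wait until the AMI is available."""
    response = ec2_client.create_image(
        InstanceId=instance_id,
        Name=name,
        Description="Instance with a recent poppler-utils preinstalled",
    )
    image_id = response["ImageId"]
    # Block until the new AMI leaves the 'pending' state.
    ec2_client.get_waiter("image_available").wait(ImageIds=[image_id])
    return image_id


# Example (requires boto3 and AWS credentials; the instance ID is a placeholder):
#   import boto3
#   create_ami_from_instance(boto3.client("ec2"), "i-0123456789abcdef0", "poppler-base")
```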

Marcin

Thanks a lot @marcin and @zethw for the answers.

I went with the AMI + build poppler from scratch approach. The high-level steps are:

  1. Create an instance suitable for creating an AMI. In my case, I was using Elastic Beanstalk for my application, so the instance had to be created from an Elastic Beanstalk AMI.

  2. Connect to that instance and build poppler. You'll notice you need to do a lot of library dance on this one. In the end, make sure $ pdftoppm --help returns proper output (as a way to test).

  3. Create an AMI from the instance you've been using in Step 2.

It sounds straightforward, but you'll have to deal with a few issues:

  • Getting a recent version of cmake, since the latest version of poppler requires one. You'll need to build it yourself, as the Amazon yum repo doesn't have a recent version.
  • While building poppler, the cmake command will prompt you for missing libraries. These may vary between Amazon Linux 1 and 2 and with your setup.
  • Don't forget to ensure the poppler utils (e.g. pdftoppm) are on the PATH at the end.
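As a sanity check before creating the AMI, something like the following sketch can confirm that the pdftoppm found on the PATH is the freshly built one and not the old yum package. The helper names and the 20.08 minimum (taken from the question) are my own assumptions; the script relies on `pdftoppm -v` printing a line of the form `pdftoppm version X.Y.Z`.

```python
import re
import shutil
import subprocess

MIN_VERSION = (20, 8, 0)  # assumption: treat the 20.08 release from the question as the minimum


def parse_poppler_version(text):
    """Extract (major, minor, patch) from output such as 'pdftoppm version 21.03.0'."""
    match = re.search(r"version (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version string found in: %r" % text)
    return tuple(int(part) for part in match.groups())


def check_pdftoppm(min_version=MIN_VERSION):
    """Return the installed version, or raise if pdftoppm is missing or too old."""
    path = shutil.which("pdftoppm")
    if path is None:
        raise RuntimeError("pdftoppm is not on the PATH")
    # Capture both streams, since the version banner may go to stderr.
    result = subprocess.run([path, "-v"], capture_output=True, text=True)
    version = parse_poppler_version(result.stdout + result.stderr)
    if version < min_version:
        raise RuntimeError("%s is too old: found %s" % (path, version))
    return version
```

Running `check_pdftoppm()` on the build instance should raise if the yum-installed 0.26.5 binary still shadows the new build.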

Word of Advice

I would say @zethw's answer is more sustainable in the long term. Otherwise, consider moving off Amazon Linux if you have the luxury.

silent_grave
  • You were able to get the latest version of Poppler to build on Amazon Linux 2? Impressive! FYI: I am taking this one step further and working with pdf2image https://github.com/Belval/pdf2image to further utilize the Docker approach, by submitting a PR to add a command line way to call convert_from_path. It is a wrapper around pdftoppm and pdftocairo that will run multithreaded. Then all you would need to do is pull the latest (or a specific version) Poppler Docker image from a repo, start the container, and call `docker exec -it` with this script. Very clean. – zethw Mar 24 '21 at 19:31
  • @zethw That'd be very nice. Probably the best approach. – silent_grave Mar 25 '21 at 10:49