
Refactored PDF & ZIP export features for images. Add reCAPTCHA or Shib...

Sean Aery requested to merge DDR-2252-pdf-zip-export into main

Refactored PDF & ZIP export features for images. Add reCAPTCHA or Shib verification to thwart bots. Use derived_image JPGs from filesystem as source instead of image server. Replace rubyzip with zip_tricks for streaming. Replace prawn with hexapdf CLI for better RAM usage. Closes DDR-2252.

This is a complete refactor of the PDF & ZIP export features for DDR image items in order to accommodate exporting 100+ page items. Summary of changes:

For either ZIP or PDF

  • uses the new derived_image JPG derivatives stored on the filesystem as the source files instead of hitting the image server with HTTP requests
  • removes deprecated code for exports relying on the image server (e.g., absolute_urls_for_img_component_jpgs)
  • adds reCAPTCHA or Shib verification to prevent bots from triggering exports
  • adds an animated loading state to the link that indicates to a user when the export is processing
  • moves logic out of the CatalogController and into a new ExportFilesController
  • updates ddr-core to 1.6.5 in order to use SolrDocument.derived_image_file_path
  • adds SolrDocument.derived_image_file_paths -- an array of component JPG paths for an item
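As an illustration of the last bullet, here is a minimal sketch of how an ordered array of component JPG paths might be assembled. It is not the code from this merge request: the helper name `derived_image_file_paths_for` and the plain objects standing in for component Solr documents are hypothetical, and the only thing assumed about a component is that it responds to `derived_image_file_path`.

```ruby
# Hypothetical helper: collects the derived JPG path of each component,
# in component order, skipping components that have no derivative path.
# `components` is any list of objects responding to #derived_image_file_path.
def derived_image_file_paths_for(components)
  components.map(&:derived_image_file_path).compact
end
```

Skipping `nil` paths here is a design assumption; an export could instead fail loudly when a component is missing its derivative.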

ZIP exports

  • replaces RubyZIP gem with ZipTricks to enable streaming the ZIP file as it builds
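The streaming approach can be sketched roughly as follows. This is an assumption-laden illustration rather than the code in this merge request: `add_images_to_zip` is a hypothetical helper, and `zip` is expected to behave like a `ZipTricks::Streamer`, whose `write_stored_file` yields a writable sink for each entry.

```ruby
# Hypothetical helper: writes each JPG into the archive as it is read,
# so the whole ZIP never has to be held in memory or on disk.
# `zip` is assumed to respond to #write_stored_file like ZipTricks::Streamer.
def add_images_to_zip(zip, image_paths)
  image_paths.each do |path|
    # JPGs are already compressed, so store them rather than deflate them again.
    zip.write_stored_file(File.basename(path)) do |sink|
      File.open(path, "rb") { |file| IO.copy_stream(file, sink) }
    end
  end
end
```

In a Rails controller that includes `ZipTricks::RailsStreaming`, this would presumably be called as `zip_tricks_stream { |zip| add_images_to_zip(zip, paths) }`, so bytes reach the client while later entries are still being read.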

PDF exports

  • replaces Prawn with HexaPDF gem to create PDFs; uses HexaPDF's command-line utility image2pdf
  • this HexaPDF approach uses only 1/3 of the system memory that Prawn used during a PDF export
  • now writes the PDF to a Tempfile (and removes it afterward)
  • no longer downscales the PDF to 1000px (this was reducing the page size but not the file size); PDF pages are now the same pixel dimensions as the source image
  • removes fastimage gem
  • note that the prawn gem has not been removed since it is presently still used for exporting AV caption files (WebVTT) as PDF
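The Tempfile-plus-CLI flow described above could look something like the sketch below. The method names are hypothetical and this is not the code from this merge request. The invocation assumes that `hexapdf image2pdf` takes the image files followed by the output PDF, and that the global `--force` option is needed to overwrite the empty Tempfile; both should be checked against `hexapdf help image2pdf`.

```ruby
require "open3"
require "tempfile"

# Builds the argv for the CLI call: images first, output PDF last.
# ASSUMPTION: --force and its position are not confirmed by the source.
def image2pdf_command(image_paths, pdf_path)
  ["hexapdf", "--force", "image2pdf", *image_paths, pdf_path]
end

# Yields the path of a temporary PDF and removes the file afterward,
# even if the block raises.
def with_temp_pdf(basename = "export")
  tempfile = Tempfile.new([basename, ".pdf"])
  tempfile.close # the external process writes to the path, not this handle
  yield tempfile.path
ensure
  tempfile&.unlink
end

# Runs the conversion in a separate process and hands the finished PDF path
# to the caller's block (for example, to send it to the client).
def export_pdf(image_paths)
  with_temp_pdf do |pdf_path|
    _out, err, status = Open3.capture3(*image2pdf_command(image_paths, pdf_path))
    raise "image2pdf failed: #{err}" unless status.success?
    yield pdf_path
  end
end
```

Running the conversion as a separate process means the memory used to build the PDF is released when that process exits, rather than staying allocated in the long-running web worker.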


  • Update ddr-admin to at least 1.13.4 to capture SolrDocument.derived_image_file_path during indexing.
  • Ensure image components have derived_image JPG files; run rake ddr:derived_images to retroactively create them.
  • Reindex image components.

Add these two environment variables; see for more info:



Options render in the Download menu:


Prompts for verification:


reCAPTCHA -- sometimes checking the box is sufficient:


reCAPTCHA -- sometimes the user has to select images, e.g.:


Passed reCAPTCHA:


Failed reCAPTCHA:


Export in Progress:


Edited by Sean Aery
