Refactored PDF & ZIP export features for images. Add reCAPTCHA or Shib verification to thwart bots. Use derived_image JPGs from filesystem as source instead of image server. Replace rubyzip with zip_tricks for streaming. Replace prawn with hexapdf CLI for better RAM usage. Closes DDR-2252.
This is a complete refactor of the PDF & ZIP export features for DDR image items in order to accommodate exporting 100+ page items. Summary of changes:
For either ZIP or PDF
- uses the new `derived_image` JPG derivatives stored on the filesystem as the source files instead of hitting the image server with HTTP requests
- removes deprecated code for exports relying on the image server (e.g., `absolute_urls_for_img_component_jpgs`)
- adds reCAPTCHA or Shib verification to prevent bots from triggering exports
- adds an animated loading state to the link that indicates to a user when the export is processing
- moves logic out of the `CatalogController` and into a new `ExportFilesController`
- updates `ddr-core` to `1.6.5` in order to use `SolrDocument.derived_image_file_path`
- adds `SolrDocument.derived_image_file_paths` -- an array of component JPG paths for an item
ZIP exports
- replaces the `RubyZIP` gem with `ZipTricks` to enable streaming the ZIP file as it builds
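
To illustrate the streaming approach, here is a minimal sketch, not the code in this change. The helper name `add_jpgs_to_zip` is hypothetical, and storing the JPGs uncompressed is an assumption (they are already compressed). The helper only needs an object that responds to `write_stored_file` the way `ZipTricks::Streamer` does, so each file is copied into the archive as it is read rather than being buffered whole.

```ruby
# Sketch only: `zip` is any object that responds to #write_stored_file in the
# same way ZipTricks::Streamer does. `add_jpgs_to_zip` is a hypothetical name.
def add_jpgs_to_zip(zip, jpg_paths)
  jpg_paths.each do |path|
    # Assumption: JPGs are stored rather than deflated, since they are
    # already compressed.
    zip.write_stored_file(File.basename(path)) do |sink|
      # Copy the file into the archive in chunks instead of loading it whole.
      File.open(path, "rb") { |jpg| IO.copy_stream(jpg, sink) }
    end
  end
end

# With the real gem this might be driven as follows (not run here):
#
#   require "zip_tricks"
#   ZipTricks::Streamer.open($stdout) do |zip|
#     add_jpgs_to_zip(zip, ["/path/to/page1.jpg", "/path/to/page2.jpg"])
#   end
```

Note that this sketch names entries by file basename and does not handle two components that share a basename.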
PDF exports
- replaces `Prawn` with the `HexaPDF` gem to create PDFs; uses HexaPDF's command-line utility `image2pdf`
- this HexaPDF approach consumes only 1/3 as much system memory as Prawn did during a PDF export
- now writes the PDF to a Tempfile (and removes it afterward)
- no longer downscales the PDF to 1000px (this was reducing the page size but not the file size); PDF pages are now the same pixel dimensions as the source image
- removes the `fastimage` gem
- note that the `prawn` gem has not been removed since it is presently still used for exporting AV caption files (WebVTT) as PDF
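
As a rough sketch of shelling out to `image2pdf` (not the code in this change), the snippet below builds the command as an argument list and runs it. The helper names `image2pdf_command` and `export_pdf` are hypothetical, the `hexapdf` executable is assumed to be on the PATH, and a temporary directory stands in for the Tempfile described above.

```ruby
require "tmpdir"

# Builds the argument list for HexaPDF's `image2pdf` command: input images
# first, output PDF last. `image2pdf_command` is a hypothetical helper name.
def image2pdf_command(jpg_paths, pdf_path)
  ["hexapdf", "image2pdf", *jpg_paths, pdf_path]
end

# Runs the command, yields the path of the finished PDF, then cleans up.
# Assumes the `hexapdf` executable is available on the PATH.
def export_pdf(jpg_paths)
  Dir.mktmpdir("export") do |dir|
    pdf_path = File.join(dir, "export.pdf")
    # Passing an array to `system` avoids shell interpolation of file names.
    ok = system(*image2pdf_command(jpg_paths, pdf_path))
    raise "hexapdf image2pdf failed" unless ok
    yield pdf_path
  end # the temporary directory and the PDF are removed here
end
```

Running the conversion in a separate process also means the memory it uses is released when the command exits, rather than staying with the web process.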
Requirements
- Update `ddr-admin` to at least `1.13.4` to capture `SolrDocument.derived_image_file_path` during indexing.
- Ensure image components have `derived_image` JPG files; run `rake ddr:derived_images` to retroactively create them.
- Reindex image components
Add these two environment variables; see https://duke.app.box.com/notes/830416801166 for more info:
RECAPTCHA_SITE_KEY
RECAPTCHA_SECRET_KEY
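
For context on how the secret key is typically used, here is a minimal sketch of server-side verification against Google's `siteverify` endpoint. It is not the code in this change, and the application may well rely on a gem instead. The helper names are hypothetical, and `token` is assumed to be the value the browser widget submits.

```ruby
require "json"
require "net/http"
require "uri"

RECAPTCHA_VERIFY_URL = URI("https://www.google.com/recaptcha/api/siteverify")

# Form fields for the siteverify request. `token` is the value submitted by
# the browser widget. The helper names in this sketch are hypothetical.
def recaptcha_params(token, secret: ENV.fetch("RECAPTCHA_SECRET_KEY"))
  { "secret" => secret, "response" => token }
end

# True only when the JSON body reports "success": true.
def recaptcha_passed?(response_body)
  data = JSON.parse(response_body)
  data.is_a?(Hash) && data["success"] == true
rescue JSON::ParserError
  false
end

# Performs the network call; defined here but not invoked by the sketch.
def verify_recaptcha(token)
  response = Net::HTTP.post_form(RECAPTCHA_VERIFY_URL, recaptcha_params(token))
  recaptcha_passed?(response.body)
end
```

`RECAPTCHA_SITE_KEY` is the public half of the pair and is used when rendering the widget in the page, while `RECAPTCHA_SECRET_KEY` should only ever be read on the server.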
Screenshots
Options render in the `Download` menu:
Prompts for verification:
reCAPTCHA -- sometimes a check is sufficient:
reCAPTCHA -- sometimes the user has to select images, e.g.:
Passed reCAPTCHA:
Failed reCAPTCHA:
Export in Progress:
Edited by Sean Aery