Writing the SquishTo Paperclip Processor

Recently at StickyAlbums, we needed a way to shrink photos, specifically JPEGs, down to at or below a specified file size. We're using Paperclip to handle image processing inside of our Rails applications, so a Paperclip processor seemed like the ideal solution.

Before deciding to reinvent the wheel, I went on a search for a pre-existing processor that would do the trick. In the end, I found several for applying different types of compression to images, but none that determined or set the level of compression based on the file size of the processed image.

So, I set about writing one.

The Initial Setup

I began by setting up an empty class for the processor.

module Paperclip
  class SquishTo < Processor
  end
end

Next, inside of this class, I had to define the initialize function, which, when called, would be passed a few variables from Paperclip.

module Paperclip
  class SquishTo < Processor
    def initialize(file, options = {}, attachment = nil)
      super
    end
  end
end

As you can see, our initialize method is passed the file, an options hash, and an attachment. The options default to an empty hash and the attachment to nil, but Paperclip will pass real values in. I also call super to inherit functionality from the parent class.

Inside this initialize function, I next wanted to set up a few other important variables, like the file object, its basename, the amount of "squish" we're going to apply (how small the final output needs to be), etc.

# Keep the file and attachment in instance variables so the
# other methods in the processor can reach them.
@file           = file
@attachment     = attachment

# Store the basename as well, for ease of use.
@basename       = File.basename(@file.path)

# Now determine the amount of squish by reading the options
# object, or default to 500 KB.
squish          = @attachment.options[:squish_to] || 500

# Multiply this by 1024 to get the max file size in bytes.
@threshhold     = 1024 * squish

So far, so good. We can now read our file when the class is initialized and set some default values around it.
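As a quick check of the arithmetic, here is a minimal standalone sketch of the threshold calculation. The threshold_bytes helper, the DEFAULT_SQUISH_KB constant, and the plain options hash are assumptions made for illustration; in the processor itself the value comes from @attachment.options.

```ruby
# Hypothetical helper, not part of Paperclip: converts a :squish_to value
# (in kilobytes) into a byte threshold, falling back to 500 when unset.
DEFAULT_SQUISH_KB = 500

def threshold_bytes(options)
  squish = options[:squish_to] || DEFAULT_SQUISH_KB
  1024 * squish
end

puts threshold_bytes({})                  # 512000
puts threshold_bytes({ squish_to: 700 })  # 716800
```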

Some Handy Helper Methods

Now, before we write our main compression logic, we need to create a few small helper functions that the processor will use. One will ensure that the file we're compressing is a JPEG file, and the other two will simply be convenience functions to make the code a little easier to read.

So let's start with our is_jpeg? function. The basic idea here is to open the image file for reading and check the first few bytes of the file header for a JPEG signature (JFIF or Exif). We do this rather than relying on file extensions because, depending on when the processor is run, the file may or may not have an extension.

For example, if the processor is run first, we're dealing with the real file. If it's run after other processors, we may be working with a temp file that has no file extension. Since we want SquishTo to work in either case, we won't rely on the file extension.

def is_jpeg?(file)
  jpg = Regexp.new("\xff\xd8\xff\xe0\x00\x10JFIF".force_encoding("binary"))
  jpg2 = Regexp.new("\xff\xd8\xff\xe1(.*){2}Exif".force_encoding("binary"))
  case IO.read(file, 10)
  when /^#{jpg}/
    true
  when /^#{jpg2}/
    true
  else
    false
  end
end
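For comparison, the same idea can be sketched without regular expressions by comparing raw bytes. This is a looser alternative, not the check used by SquishTo: it only looks at the first three bytes that every JPEG stream begins with, so it also accepts JPEGs that carry neither a JFIF nor an Exif segment. The jpeg_signature? name and the sample data are assumptions for illustration.

```ruby
require "tempfile"

# A JPEG stream starts with the SOI marker (FF D8) followed by another
# marker that begins with FF, so three bytes give a rough signature.
JPEG_MAGIC = "\xFF\xD8\xFF".b

# Hypothetical helper: true when the file at `path` starts with JPEG_MAGIC.
def jpeg_signature?(path)
  header = File.binread(path, 3) # nil for an empty file
  header == JPEG_MAGIC
end

# Write two fake headers to temp files (no extension needed) and test them.
samples = {
  "jfif header" => "\xFF\xD8\xFF\xE0\x00\x10JFIF".b,
  "png header"  => "\x89PNG\r\n\x1A\n".b
}

samples.each do |label, bytes|
  Tempfile.create("sample") do |f|
    f.binmode
    f.write(bytes)
    f.flush
    puts "#{label}: #{jpeg_signature?(f.path)}"
  end
end
```

Because the check reads the file contents, it behaves the same for the original upload and for an extensionless temp file.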

Next, let's create our two convenience methods, fromFile and toFile, which return the absolute paths of the source and destination files.

def fromFile
  File.expand_path(@file.path)
end

def toFile(destination)
  File.expand_path(destination.path)
end

Compressing the Image

With those out of the way, it was time to set up our main make method. This is the method that Paperclip calls to perform the actual image processing. So let's take a quick look at an empty one, with some exception handling included.

def make
  begin
  rescue
    raise Paperclip::Error, "Could not convert #{@basename}" if @whiny
  end
end

As you can see, there's no code in our begin block yet, so if we called this function right now, it wouldn't do anything. However, we are setting up an exception handler to simply report a conversion failure if everything goes sideways.
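To make the behaviour of that rescue clause concrete, here is a small standalone sketch. FakeProcessor and ConversionError are hypothetical stand-ins (for the processor and for Paperclip::Error) so the example runs without Paperclip; the failure is simulated.

```ruby
# Stand-in for Paperclip::Error so the sketch runs on its own.
class ConversionError < StandardError; end

# Hypothetical processor that always fails inside its begin block.
class FakeProcessor
  def initialize(whiny)
    @whiny = whiny
  end

  def make
    begin
      raise IOError, "simulated failure"
    rescue
      raise ConversionError, "Could not convert example.jpg" if @whiny
    end
  end
end

# With @whiny falsy the error is swallowed and make returns nil.
p FakeProcessor.new(false).make

# With @whiny truthy the failure is reported to the caller.
begin
  FakeProcessor.new(true).make
rescue ConversionError => e
  puts e.message
end
```

One thing worth checking in a real processor: the initialize method shown in this post never assigns @whiny, so unless the parent class sets it, it stays nil and the first case applies.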

So let's start adding conversion code inside of our begin statement.

def make
  begin
    if is_jpeg?(@file.path)
    else
      @file
    end
  rescue
    raise Paperclip::Error, "Could not convert #{@basename}" if @whiny
  end
end

In the code above, we use our is_jpeg? method defined earlier to ensure that we're only running the compression on a JPEG image. Otherwise, we simply return the @file object without doing any further processing.

So what happens if we are working with a JPEG image? Here's where it gets interesting.

size = File.size(@file.path)

if size > @threshhold
  temp_file = Tempfile.new(@basename)
  temp_file.binmode

  quality = 98

  while size > @threshhold && quality > 0
    convert("-strip -interlace Plane -quality #{quality}% #{fromFile} #{toFile(temp_file)}")
    size = File.size(temp_file)
    quality = quality - 2
  end

  temp_file
else
  @file
end

As you can see, we start by determining the file size of the original image. If it's at or below our threshold, no further processing is necessary and we simply return it.

However, if it's bigger, we create a new binary temp file and, starting at an image quality of 98, we run a loop that converts the image and lowers the quality by 2 on each pass, until the file size of the converted image is at or below our @threshhold value or the quality reaches 0.

Once we've made it to the other side of the loop, we return the newly created and converted temp_file, instead of the original @file object.
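The control flow of that loop can be tried out without ImageMagick by swapping the convert call for a stub. The sketch below does that; fake_encoded_size and simulate_squish are hypothetical names, and the linear size model is purely an assumption for illustration, since real JPEG sizes do not scale this way.

```ruby
# Hypothetical stand-in for the ImageMagick call: pretends the output size
# scales linearly with quality. Real JPEG sizes do not follow this model.
def fake_encoded_size(original_size, quality)
  original_size * quality / 100
end

# Mirrors the loop above: start at quality 98 and step down by 2
# until the size fits or the quality runs out.
def simulate_squish(original_size, threshold)
  size = original_size
  quality = 98
  used_quality = nil
  passes = 0

  while size > threshold && quality > 0
    size = fake_encoded_size(original_size, quality)
    used_quality = quality
    passes += 1
    quality -= 2
  end

  { size: size, quality: used_quality, passes: passes }
end

p simulate_squish(1_000_000, 512_000)
p simulate_squish(1_000_000, 10_000)
```

Under this toy model the first call fits after 25 passes, at quality 50. The second call runs all 49 passes and still ends above its threshold, which shows that the loop can exit on the quality bound and return a file that is larger than requested.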

Putting It All Together

Here's a look at everything together in one place.

module Paperclip
  class SquishTo < Processor
    def initialize(file, options = {}, attachment = nil)
      super
      @file           = file
      @attachment     = attachment
      @basename       = File.basename(@file.path)

      squish          = @attachment.options[:squish_to] || 500
      @threshhold     = 1024 * squish
    end

    def make
      begin
        if is_jpeg?(@file.path)
          size = File.size(@file.path)

          if size > @threshhold
            temp_file = Tempfile.new(@basename)
            temp_file.binmode

            quality = 98

            while size > @threshhold && quality > 0
              convert("-strip -interlace Plane -quality #{quality}% #{fromFile} #{toFile(temp_file)}")
              size = File.size(temp_file)
              quality = quality - 2
            end

            temp_file
          else
            @file
          end
        else
          @file
        end
      rescue
        raise Paperclip::Error, "Could not convert #{@basename}" if @whiny
      end
    end

    def fromFile
      File.expand_path(@file.path)
    end

    def toFile(destination)
      File.expand_path(destination.path)
    end

    def is_jpeg?(file)
      jpg = Regexp.new("\xff\xd8\xff\xe0\x00\x10JFIF".force_encoding("binary"))
      jpg2 = Regexp.new("\xff\xd8\xff\xe1(.*){2}Exif".force_encoding("binary"))
      case IO.read(file, 10)
      when /^#{jpg}/
        true
      when /^#{jpg2}/
        true
      else
        false
      end
    end
  end
end

Installation

To install SquishTo, create a paperclip_processors folder inside your Rails application's lib directory. Then simply place squish_to.rb inside of it.

Usage

To use SquishTo, you'll need to add it as a processor inside of your model, like so:

has_attached_file :image, :styles => {
  res2048:  "2048x2048>",
  res960:   "960x960>"
},
processors: [:thumbnail, :squish_to]

validates_attachment_content_type :image, content_type: [ 'image/jpg', 'image/jpeg', "image/png", "image/gif"]
validates_attachment_size :image, less_than: 8.megabytes
validates_attachment_presence :image

Note that the position of SquishTo in the list of Paperclip processors is important. If you're resizing the image file's physical dimensions, SquishTo should run after the default :thumbnail processor for best results.

Customizing

If you'd like to compress to a final output size other than 500 KB, create a paperclip.rb file inside your Rails application's config/initializers directory and set the SquishTo output size like so:

Paperclip::Attachment.default_options[:squish_to] = 700

Please note that the :squish_to value is always specified in kilobytes.

Use Cases and Limitations

SquishTo, while useful, is definitely a processor with a distinct set of limitations.

Compressing images based on file size rather than visual quality can lead to highly compressed images if the source files are too large or the desired final output is too small. Also, since SquishTo reaches its target size by repeatedly reprocessing the image until it fits, the process can be quite time and resource consuming for very large uploads.

We've found in our tests that SquishTo works best when image uploads are 3 MB or smaller and the final output is between 400 and 700 KB.
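One possible way to reduce the number of reprocessing passes, which SquishTo does not currently do, is to binary-search the quality setting instead of stepping down by 2. The sketch below assumes that output size never increases as quality decreases. Both best_quality and the encode_size callable are hypothetical, and the linear size model stands in for a real conversion.

```ruby
# Finds the highest quality in 1..100 whose encoded size fits the threshold.
# `encode_size` is a hypothetical callable mapping quality -> size in bytes.
# Returns [quality, passes]; quality is nil if even quality 1 is too large.
def best_quality(threshold, encode_size)
  low = 1
  high = 100
  best = nil
  passes = 0

  while low <= high
    mid = (low + high) / 2
    passes += 1
    if encode_size.call(mid) <= threshold
      best = mid       # fits: remember it and try a higher quality
      low = mid + 1
    else
      high = mid - 1   # too big: try a lower quality
    end
  end

  [best, passes]
end

# Toy size model (an assumption): a 1,000,000-byte source, linear in quality.
fake = ->(quality) { 1_000_000 * quality / 100 }

quality, passes = best_quality(512_000, fake)
puts "quality=#{quality} passes=#{passes}"   # quality=51 passes=7
```

With a range of 1 to 100 the search needs at most 7 conversions, compared with up to 49 for the stepwise loop, at the cost of relying on the monotonic-size assumption.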