This repository has been archived by the owner on Jun 18, 2019. It is now read-only.

dscout/fragmenter

Fragmenter

Fragmenter is a library for multipart upload support backed by Redis. Fragmenter handles storing multiple parts of a larger binary and rebuilding it back into the original after all parts have been stored.

Why Fragment?

It alleviates the problems posed by uploading large blocks of data from slow clients, notably mobile apps, by allowing the device to send multiple smaller blocks of data independently. Once all of the smaller blocks have been received they can quickly be rebuilt into the original file on the server.

Think of multipart uploading as blocky streaming. Nginx, Rack and Rails all make it impossible to stream binary uploads. Breaking them into manageable pieces is the simplest workaround.

Fragments are intended to be rather small, anywhere from 10 to 50 KB depending on the underlying data size. There is a balance between the connection overhead from repeated server calls, tolerance of connection errors, and not blocking the server from handling other connections.
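
As a rough illustration of the sizing trade-off above, here is a minimal client-side sketch in plain Ruby that splits a binary blob into numbered fragments and joins them back together. The helper names (`split_into_fragments`, `rebuild_from_fragments`) and the 50 KB fragment size are assumptions made for this example; they are not part of Fragmenter, which stores fragments in Redis rather than in a Hash.

```ruby
# Split a binary string into numbered fragments of at most `size` bytes.
# Numbering starts at 1. (Hypothetical helper, not part of Fragmenter.)
def split_into_fragments(data, size)
  bytes     = data.b # operate on raw bytes, not characters
  fragments = {}
  offset    = 0
  number    = 1

  while offset < bytes.bytesize
    fragments[number] = bytes.byteslice(offset, size)
    offset += size
    number += 1
  end

  fragments
end

# Join fragments in numeric order, regardless of arrival order.
# (Hypothetical helper, not part of Fragmenter.)
def rebuild_from_fragments(fragments)
  fragments.sort_by { |number, _| number }.map { |_, part| part }.join.b
end

original  = Random.new(42).bytes(120_000)           # stand-in for a file
fragments = split_into_fragments(original, 50 * 1024)
shuffled  = fragments.to_a.shuffle.to_h             # simulate out-of-order arrival
rebuilt   = rebuild_from_fragments(shuffled)
```

Smaller fragments mean more requests but less data to resend after a dropped connection; larger fragments mean the opposite.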

Heroku

Due to the way the service queues requests, we do not recommend using Fragmenter for apps on Heroku. Fragmenter is designed to make uploads from slow mobile clients easier and more fault tolerant. Heroku kills all requests 30 seconds after the point where the request started, not the point where it was handed to your application. Fragmenting uploads uses a fair number of additional requests and can cause rampant timeout errors (H12 on Heroku).

That isn't to say that Fragmenter doesn't work on Heroku; it is just sub-optimal, particularly when compared to a properly configured Nginx proxy.

Requirements

Fragmenter is tested on Ruby 1.9.3, 2.0, and 2.1. Any other Ruby implementation that supports 1.9 syntax should also work.

Redis 2.0 or greater is required and version 2.6 is recommended.

Installation

Add this to your Gemfile:

gem 'fragmenter'

Configuration

You can configure the following components of Fragmenter:

  • redis - Redis instance to use for IO. Defaults to a new instance connected to localhost.
  • logger - Logger instance to write out to. Defaults to STDOUT at the INFO level.
  • expiration - The number of seconds until fragments will expire. Defaults to 86400, or 1 day.

Fragmenter.configure do |config|
  config.redis      = $redis
  config.logger     = Rails.logger
  config.expiration = 2.days.to_i
end

Using Fragmenter with Rails

Fragmenter is designed to be used from within a Rails controller. Include the provided Fragmenter::Rails::Controller module in any controller that should process uploads:

class UploadsController < ApplicationController
  include Fragmenter::Rails::Controller

  private

  def resource
    @resource ||= Avatar.find(params[:avatar_id])
  end
end

The module adds methods for handling the GET, PUT, and DELETE requests needed for fragment uploads. You must define a resource method that returns an object that responds to fragmenter. In the example above the resource is an instance of the Avatar model, which could look something like this:

class Avatar < ActiveRecord::Base
  include Fragmenter::Rails::Model

  def rebuild_fragments
    self.avatar = Fragmenter::DummyIO.new(fragmenter.rebuild).tap do |io|
      io.content_type = fragmenter.meta['content_type']
    end

    save!
  end
end

You must provide a concrete rebuild_fragments method that will perform rebuilding, saving, persisting etc. Without overriding rebuild_fragments a Fragmenter::AbstractMethodError will be raised when storage is complete and it attempts to rebuild.

The example above performs synchronous storage using a mounted CarrierWave-style uploader. You may want to perform rebuilding with a background worker instead to keep response times speedy.

Next, configure your routes to map show, update and destroy to the uploads controller:

MyApp::Application.routes.draw do
  resource :avatar do
    resource :upload, only: [:show, :update, :destroy]
  end
end

Then you can start sending PUT requests with successive fragments of data. Each fragment will be stored uniquely to the parent object, an instance of Avatar in this case. For each fragment that is stored the response will be the JSON representation of the fragments along with a 200 OK status code:

curl -i                        \
     -X PUT                    \
     -H 'X-Fragment-Number: 1' \
     -H 'X-Fragment-Total: 2'  \
     --data-binary @blob-1     \
     http://example.com/avatar/1/upload

#=> HTTP/1.1 200 OK
#=> { "content_type": "image/jpeg", "fragments": [1], "total": 2 }
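
The JSON body above lists the fragments stored so far along with the expected total, so a client can work out what is left to send. A minimal sketch, assuming the response shape shown above; `missing_fragments` is a hypothetical helper for this example, not something Fragmenter provides:

```ruby
require 'json'

# Given the JSON body from a fragment upload response, return the
# fragment numbers that have not been stored yet.
# (Hypothetical client-side helper, not part of Fragmenter.)
def missing_fragments(response_body)
  status = JSON.parse(response_body)
  (1..status['total']).to_a - status['fragments']
end

body      = '{ "content_type": "image/jpeg", "fragments": [1], "total": 2 }'
remaining = missing_fragments(body)
```

A client that loses its connection mid-upload could use a list like this to resend only the parts that are still outstanding.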

When the final part is uploaded the status code will be 202 Accepted if the fragment is valid and can be rebuilt:

curl -i                        \
     -X PUT                    \
     -H 'X-Fragment-Number: 2' \
     -H 'X-Fragment-Total: 2'  \
     --data-binary @blob-2     \
     http://example.com/avatar/1/upload

#=> HTTP/1.1 202 Accepted
#=> { "content_type": "image/jpeg", "fragments": [1,2], "total": 2 }
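
The same sequence of requests can be assembled from Ruby. The sketch below uses only the standard library's Net::HTTP to build one PUT request per fragment, with the X-Fragment-Number and X-Fragment-Total headers from the curl examples. The helper name `build_fragment_requests`, the fragment size, and the application/octet-stream content type are assumptions made for illustration. Nothing is sent over the network here.

```ruby
require 'net/http'
require 'uri'

# Build one PUT request per fragment without sending anything.
# (Hypothetical helper, not part of Fragmenter.)
def build_fragment_requests(url, data, fragment_size)
  uri   = URI(url)
  bytes = data.b
  total = (bytes.bytesize.to_f / fragment_size).ceil

  (1..total).map do |number|
    request = Net::HTTP::Put.new(uri)
    request['X-Fragment-Number'] = number.to_s
    request['X-Fragment-Total']  = total.to_s
    # Assumed content type; the curl examples do not set one explicitly.
    request['Content-Type']      = 'application/octet-stream'
    request.body = bytes.byteslice((number - 1) * fragment_size, fragment_size)
    request
  end
end

requests = build_fragment_requests(
  'http://example.com/avatar/1/upload', 'x' * 25_000, 20_000
)

# To actually send them, something along these lines should work:
# Net::HTTP.start('example.com', 80) do |http|
#   requests.each { |request| http.request(request) }
# end
```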

If you need to customize the status codes for partial or complete PUT requests you can override the update_status method within the controller:

private

# Return 201 Created instead of 202 Accepted
def update_status
  uploader.complete? ? 201 : 200
end

Validation

Often you will want to be sure that all of the data is being stored without any bytes missing. A standard way to handle this is by sending a checksum that is verified after transfer. Fragmenter handles checksum matching using a validator that verifies each fragment that is uploaded. Validation is handled for any request where the Content-MD5 header has been sent:

curl -X PUT                                             \
     -H 'Content-MD5: ceba1b1ffc89e99abb54c1f8ab0c4157' \
     -H 'X-Fragment-Number: 1'                          \
     -H 'X-Fragment-Total: 1'                           \
     --data-binary @blob                                \
     http://example.com/avatar/1/upload

Failure to match the checksum will result in a 422 Unprocessable Entity response with an accompanying message and errors:

{ "message": "Upload of part failed.",
  "errors":  [
    "Expected checksum {{expected}} to match {{calculated}}"
  ]
}
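
On the client side, the checksum for a fragment can be computed with Ruby's standard Digest library. This sketch assumes the header carries a hex-encoded MD5 digest, as the 32-character value in the curl example suggests; `fragment_checksum` and `checksum_matches?` are hypothetical helpers, not Fragmenter APIs.

```ruby
require 'digest'

# Hex-encoded MD5 digest of a fragment's raw bytes.
# (Hypothetical helper; the hex encoding is assumed from the curl example.)
def fragment_checksum(bytes)
  Digest::MD5.hexdigest(bytes)
end

# Compare an expected checksum against the bytes actually received.
def checksum_matches?(expected, bytes)
  expected == fragment_checksum(bytes)
end

fragment = 'hello world'.b
headers  = {
  'Content-MD5'       => fragment_checksum(fragment),
  'X-Fragment-Number' => '1',
  'X-Fragment-Total'  => '1'
}
```

If even one byte of the fragment changes in transit, the recomputed digest will differ from the header value and the comparison will fail.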

As image uploads are a common use case for fragmented uploading, an ImageValidator is included, but not as one of the defaults. You can control which validators are used by overriding the validators method within the controller:

class AvatarUploader < ApplicationController
  ...

  private

  def validators
    super + [ImageValidator, CustomValidator]
  end
end

To add a custom validator you must add it at some point in the validator chain. A validator can be any class that responds to valid?, part?, and provides a list of errors. See the ImageValidator for an example validator that only performs validation when all fragments are complete.

Note that ImageMagick is required for the ImageValidator to work, but it doesn't require RMagick or MiniMagick.
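
As a rough sketch of the custom validator interface described above, the class below responds to valid? and part? and exposes a list of errors. Everything else is an assumption made for illustration: the MaxSizeValidator name, the constructor taking a request, the FakeRequest stand-in with a string body, and the reading of part? as "run this validator on every fragment". Check the bundled validators for the actual conventions before relying on any of it.

```ruby
# Hypothetical stand-in for whatever request object a validator receives.
FakeRequest = Struct.new(:body)

# Hypothetical validator that rejects fragments above a size limit.
class MaxSizeValidator
  MAX_BYTES = 50 * 1024

  attr_reader :errors

  # Assumption: validators are constructed with the current request.
  def initialize(request)
    @request = request
    @errors  = []
  end

  # Assumption: true means the validator applies to each individual
  # fragment rather than only to the completed upload.
  def part?
    true
  end

  # Collects error messages and reports whether the fragment passed.
  def valid?
    @errors = []
    size    = @request.body.bytesize

    if size > MAX_BYTES
      @errors << "Expected fragment of at most #{MAX_BYTES} bytes, got #{size}"
    end

    @errors.empty?
  end
end

small = MaxSizeValidator.new(FakeRequest.new('a' * 1_024))
large = MaxSizeValidator.new(FakeRequest.new('a' * 60_000))
small_ok = small.valid?
large_ok = large.valid?
```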