Elixir : Upload zip to S3

How to upload a zip to S3


Today we'll see how to upload a bunch of files stored in a zip file using Elixir and Phoenix and a little bit of Erlang too.

First, let's walk through the process. The user sends us a zip file through a multipart form; we receive and store it. We then unzip the file (in our case into /tmp) and use the Elixir Arc library to store each extracted file in S3. More precisely, we use Arc Ecto, because we want to keep a reference to each file in our database so we can access it later.

We want the import to be transactional: either every entry of the zip ends up in our database, or none does. Ecto.Multi lets us do that by grouping all the inserts into a single transaction.

First, let's look at our Media model, which stores the media in PostgreSQL.

defmodule MyApp.Media do
  use Ecto.Schema
  use Arc.Ecto.Schema
  import Ecto.Changeset

  @primary_key {:id, Ecto.UUID, autogenerate: true}
  @derive {Phoenix.Param, key: :id}

  schema "model_medias" do
    field(:name, MyApp.Uploader.Media.Type)

    timestamps()
  end

  @doc false
  def changeset(media, attrs) do
    media
    |> cast(attrs, [:name])
    |> cast_attachments(attrs, [:name])
    |> validate_required([:name])
  end
end

We've defined a UUID primary key, which is not mandatory, and a name field of type MyApp.Uploader.Media.Type. In the database this field is just a simple string holding the file name and a timestamp. Now let's look at our media uploader module.

defmodule MyApp.Uploader.Media do
  use Arc.Definition
  use Arc.Ecto.Definition

  # To add a thumbnail version:
  @versions [:original, :thumb]

  # Whitelist file extensions:
  def validate({file, _}) do
    ~w(.jpg .jpeg .gif .png .pdf) |> Enum.member?(Path.extname(file.file_name))
  end

  # Define a thumbnail transformation:
  def transform(:thumb, {_file, _scope}) do
    {:convert, "-strip -thumbnail 250x250^ -format png", :png}
  end

  # Override the persisted filenames:
  def filename(version, {file, _scope}) do
    file_name = Path.basename(file.file_name, Path.extname(file.file_name))
    "#{version}_#{file_name}"
  end

  def filename(version, _) do
    version
  end

  # Override the storage directory:
  def storage_dir(_version, {_file, _scope}) do
    "uploads/medias/"
  end
end

This is an Arc definition; for all the details I strongly recommend reading the official documentation. This module says that we only accept a few file extensions, that we store the files in the "uploads/medias/" folder, and that we create a 250x250 thumbnail when uploading.

Now that we have our model, we need to load the zip file and unzip it. For that we'll use Erlang's :zip module. You first open the archive with zip_open, then extract its files with zip_get (to disk or in memory, depending on the options given to zip_open), and finally close the handle by calling zip_close. Here is how you do it:

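# Assumes these aliases in the enclosing module (adjust the names to your app):
#   alias Ecto.Multi
#   alias MyApp.{Media, Repo}
def import(%Plug.Upload{} = zipfile) do
  path = to_charlist(zipfile.path)
  path_name = to_charlist("/tmp")

  with {:ok, handle} <- :zip.zip_open(path, [{:cwd, path_name}]) do
    try do
      # `<-` (not `=`) so a failed extraction returns {:error, reason}
      # instead of raising; the handle is closed in `after` either way
      with {:ok, file_names} <- :zip.zip_get(handle) do
        file_names
        # remove files beginning with '.'
        |> filter_hidden_files()
        # transform file to %Plug.Upload{}
        |> to_plug_upload()
        # create multi with all inserts
        |> to_multi(Multi.new())
        # run the transaction
        |> Repo.transaction()
      end
    after
      :zip.zip_close(handle)
    end
  end
end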
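
To try the zip_open / zip_get / zip_close sequence in isolation, here is a minimal sketch that needs nothing beyond Elixir and OTP. The ZipSketch module, the extract_all function and the throwaway archive are hypothetical names made up for this illustration; they are not part of the application code above.

```elixir
defmodule ZipSketch do
  # Hypothetical helper: extracts every entry of the archive at `zip_path`
  # into `dest_dir` and returns {:ok, paths} with the paths as Elixir strings.
  def extract_all(zip_path, dest_dir) do
    opts = [{:cwd, to_charlist(dest_dir)}]

    with {:ok, handle} <- :zip.zip_open(to_charlist(zip_path), opts) do
      try do
        with {:ok, paths} <- :zip.zip_get(handle) do
          # :zip returns charlists, so convert them back to strings
          {:ok, Enum.map(paths, &to_string/1)}
        end
      after
        # always release the handle, even if extraction fails
        :zip.zip_close(handle)
      end
    end
  end
end

# Build a throwaway archive so the sketch runs on its own.
tmp = Path.join(System.tmp_dir!(), "zip_sketch_#{System.unique_integer([:positive])}")
out = Path.join(tmp, "out")
File.mkdir_p!(out)
zip_path = Path.join(tmp, "demo.zip")

{:ok, _} =
  :zip.create(to_charlist(zip_path), [
    {to_charlist("a.txt"), "hello"},
    {to_charlist(".hidden"), "secret"}
  ])

{:ok, extracted} = ZipSketch.extract_all(zip_path, out)
IO.inspect(extracted, label: "extracted")
```

Passing a path that is not a valid archive makes zip_open return an {:error, reason} tuple, which the outer with simply hands back to the caller.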

As you can see, we chose to unzip into the "/tmp" folder. After getting the list of file paths, we remove the hidden ones (those beginning with '.') and build the transaction that stores the data. I'll skip the filter_hidden_files function, which is trivial to write, and show you the to_plug_upload function, which is a little trickier.

defp to_plug_upload(_, uploads \\ [])

defp to_plug_upload([], uploads) do
  uploads
end

defp to_plug_upload([file | tail], uploads) do
  upload = %Plug.Upload{
    content_type: MIME.from_path(file),
    filename: Path.basename(file),
    path: to_string(file)
  }

  to_plug_upload(tail, uploads ++ [upload])
end


As you can see, we recurse over each file extracted from the zip and build a Plug.Upload struct for it, setting the MIME type (thanks to the MIME module), the file name, and the file path. Remember that to use the :zip library we had to convert our string to a charlist (Erlang represents strings as charlists); here we do the opposite and convert the extracted file path back to a string.
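
Because :zip hands back charlists, the filter_hidden_files helper skipped earlier has to cope with them too. A minimal sketch of what it could look like, assuming "hidden" simply means that the file name starts with a dot (the HiddenFilter module name is made up for this example, and the real helper may differ):

```elixir
defmodule HiddenFilter do
  # Drops entries whose file name starts with a dot (".DS_Store", ".gitkeep", ...).
  # Path.basename/1 accepts charlists as well as strings and returns a string,
  # so the paths coming from :zip.zip_get/1 can be passed in unchanged.
  def filter_hidden_files(file_names) do
    Enum.reject(file_names, fn file ->
      file |> Path.basename() |> String.starts_with?(".")
    end)
  end
end

visible =
  HiddenFilter.filter_hidden_files([
    to_charlist("/tmp/photo.png"),
    to_charlist("/tmp/.DS_Store"),
    to_charlist("/tmp/doc.pdf")
  ])

IO.inspect(Enum.map(visible, &to_string/1), label: "visible")
```

The entries that pass the filter are returned untouched (still charlists), so the conversion to strings stays in to_plug_upload.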

Then the transaction itself is built by the to_multi function.


defp to_multi([], multi) do
  multi
end

defp to_multi([file | tail], multi) do
  attrs = %{name: file}
  m = %Media{} |> Media.changeset(attrs)
  to_multi(tail, multi |> Multi.insert(file.filename, m))
end

It's straightforward too: we recurse over the file list and add one insert to our transaction for each file.
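
On the calling side, the import can end in a few different shapes: Repo.transaction returns {:ok, changes} when every operation of a Multi succeeds and {:error, failed_operation, failed_value, changes_so_far} when one fails, while :zip.zip_open returns {:error, reason} for an unreadable archive. Here is a minimal sketch of how a caller might tell them apart; the ImportResult module and its messages are hypothetical and only there to illustrate the pattern matching.

```elixir
defmodule ImportResult do
  # Every insert succeeded: `changes` maps each Multi operation name
  # (the file name in to_multi) to the inserted record.
  def describe({:ok, changes}) when is_map(changes) do
    {:ok, "imported #{map_size(changes)} file(s)"}
  end

  # One insert failed: the transaction is rolled back and Ecto reports
  # which operation failed and with what value (usually a changeset).
  def describe({:error, failed_operation, _failed_value, _changes_so_far}) do
    {:error, "could not import #{inspect(failed_operation)}, no rows were saved"}
  end

  # The archive itself could not be opened (error coming from :zip).
  def describe({:error, reason}) do
    {:error, "invalid archive: #{inspect(reason)}"}
  end
end

IO.inspect(ImportResult.describe({:ok, %{"a.png" => :media_a, "b.png" => :media_b}}))
IO.inspect(ImportResult.describe({:error, "a.png", :invalid_changeset, %{}}))
IO.inspect(ImportResult.describe({:error, :enoent}))
```

In a Phoenix controller these three branches would typically map to a success response, a validation error and a "bad upload" error respectively.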

I hope this quick example will make your work easier if you're coding in Elixir, or make you want to try it out.

Oh, and if you find the possible bug in this very basic implementation you can put it as a comment.
