How to upload a zip to S3
Today we'll see how to
upload a bunch of files stored in a zip file using Elixir and Phoenix
and a little bit of Erlang too.
First, let's go over the process. The user sends us a zip file
through a multipart form; we receive it and store it. Then we unzip
the file (in our case into /tmp) and use the Elixir Arc library to
store each extracted file in S3. More precisely, we use Arc Ecto,
because we want to keep the file reference in our database so we can
access it later.
We want the import to be transactional: we don't want only part of
the zip's entries stored in our database, but all of them or none.
Ecto.Multi allows us to do this.
First, let's look at our Media model, which stores the media in
PostgreSQL.
defmodule MyApp.Media do
  use Ecto.Schema
  use Arc.Ecto.Schema
  import Ecto.Changeset

  @primary_key {:id, Ecto.UUID, autogenerate: true}
  @derive {Phoenix.Param, key: :id}

  schema "model_medias" do
    field(:name, MyApp.Uploader.Media.Type)
    timestamps()
  end

  @doc false
  def changeset(media, attrs) do
    media
    |> cast(attrs, [:name])
    |> cast_attachments(attrs, [:name])
    |> validate_required([:name])
  end
end
At the top we define a UUID primary key (this is not mandatory) and
a name field of type MyApp.Uploader.Media.Type. In the database, this
field is stored as a simple string made of the file name and a
timestamp. Now let's look at our media uploader module.
defmodule MyApp.Uploader.Media do
  use Arc.Definition
  use Arc.Ecto.Definition

  # To add a thumbnail version:
  @versions [:original, :thumb]

  # Whitelist file extensions:
  def validate({file, _}) do
    ~w(.jpg .jpeg .gif .png .pdf) |> Enum.member?(Path.extname(file.file_name))
  end

  # Define a thumbnail transformation:
  def transform(:thumb, {_file, _scope}) do
    {:convert, "-strip -thumbnail 250x250^ -format png", :png}
  end

  # Override the persisted filenames:
  def filename(version, {file, _scope}) do
    file_name = Path.basename(file.file_name, Path.extname(file.file_name))
    "#{version}_#{file_name}"
  end

  def filename(version, _) do
    version
  end

  # Override the storage directory:
  def storage_dir(_version, {_file, _scope}) do
    "uploads/medias/"
  end
end
As you can see, this is an Arc definition; for all the details I
strongly recommend the official documentation. This module specifies
that files are stored in the "uploads/medias/" folder and that a
250x250 thumbnail is created on upload.
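To make the validation and naming rules of the uploader concrete, here is a minimal sketch that reproduces their logic with the standard library only, outside of Arc. The module name UploaderSketch, its function names and the sample file names are made up for illustration; they are not part of Arc or of the application.

```elixir
defmodule UploaderSketch do
  # Hypothetical stand-alone module: it mirrors the logic of the uploader's
  # validate/1 and filename/2 callbacks without depending on Arc.
  @allowed ~w(.jpg .jpeg .gif .png .pdf)

  # Same check as validate/1: the extension must be in the whitelist.
  def allowed?(file_name), do: Path.extname(file_name) in @allowed

  # Same naming rule as filename/2: version prefix, extension stripped.
  def stored_name(version, file_name) do
    "#{version}_#{Path.basename(file_name, Path.extname(file_name))}"
  end
end

IO.inspect(UploaderSketch.allowed?("photo.png"))
IO.inspect(UploaderSketch.allowed?("photo.PNG"))
IO.inspect(UploaderSketch.stored_name(:thumb, "holiday/photo.png"))
```

Note that the extension check is case-sensitive, so "photo.PNG" would be rejected; downcasing the extension before the comparison is one way to accept it.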
Now that we have our model, we need to load the zip file and unzip
it. To do that we'll use Erlang's :zip module. You first “open” the
archive with zip_open (extracting either to disk or in memory), then
extract the files with zip_get so you can store each one, and finally
close the handle by calling zip_close. Here is how you do it:
def import(%Plug.Upload{} = zipfile) do
  path = to_charlist(zipfile.path)
  path_name = to_charlist("/tmp")

  with {:ok, handle} <- :zip.zip_open(path, [{:cwd, path_name}]),
       {:ok, file_names} <- :zip.zip_get(handle) do
    try do
      file_names
      # remove files beginning with '.'
      |> filter_hidden_files()
      # transform files to %Plug.Upload{}
      |> to_plug_upload()
      # create a multi with all inserts
      |> to_multi(Multi.new())
      # run the transaction
      |> Repo.transaction()
    after
      :zip.zip_close(handle)
    end
  end
end
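If you want to try the :zip calls in isolation, the following is a minimal sketch that works entirely in memory, so it needs neither Phoenix nor a file on disk. The ZipSketch module and the sample archive content are hypothetical; only :zip.create/3, :zip.zip_open/2, :zip.zip_get/1 and :zip.zip_close/1 come from Erlang.

```elixir
defmodule ZipSketch do
  # Hypothetical helper: opens an archive held in memory, returns
  # {:ok, entries} with entries as {name, content} tuples, and always
  # closes the handle once it has been opened.
  def entries(zip_binary) when is_binary(zip_binary) do
    with {:ok, handle} <- :zip.zip_open(zip_binary, [:memory]) do
      try do
        :zip.zip_get(handle)
      after
        :zip.zip_close(handle)
      end
    end
  end
end

# Build a tiny archive in memory so the sketch needs no file on disk.
{:ok, {_name, zip_binary}} =
  :zip.create(~c"sample.zip", [{~c"a.txt", "hello"}, {~c".hidden", "x"}], [:memory])

{:ok, entries} = ZipSketch.entries(zip_binary)

# Entry names come back as charlists, hence the to_string/1 call.
for {name, content} <- entries do
  IO.puts("#{to_string(name)}: #{byte_size(content)} bytes")
end
```

In this sketch zip_get is called inside the try block, so the handle is closed even when extraction itself fails.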
The import function unzips into the “/tmp” folder. After getting the
list of file paths, we remove the hidden ones (those beginning with
'.') and then build the transaction that stores the data. I'll skip
the filter_hidden_files function, which is trivial to code, and show
you the to_plug_upload function, which is a little trickier.
defp to_plug_upload(_, uploads \\ [])

defp to_plug_upload([], uploads) do
  uploads
end

defp to_plug_upload([file | tail], uploads) do
  upload = %Plug.Upload{
    content_type: MIME.from_path(file),
    filename: Path.basename(file),
    path: to_string(file)
  }

  to_plug_upload(tail, uploads ++ [upload])
end
As you can see, the recursion goes over each file contained in the
zip and builds a Plug.Upload struct for it, setting the MIME type
(thanks to the MIME module), the filename, and the file path.
Remember that to use the :zip library we had to convert our string to
a charlist (for Erlang compatibility); we need to do the same in the
other direction and convert each file path back to a string.
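The filter_hidden_files helper is not shown in this article. As a rough sketch of what it could look like, the version below assumes that entries arrive as charlist paths (as returned by :zip.zip_get/1) and that "hidden" means the base name starts with a dot. The module name and the sample paths are made up for illustration, and this is not necessarily how the original helper was written.

```elixir
defmodule HiddenFilesSketch do
  # Hypothetical implementation: drops every path whose base name starts
  # with ".", whether the path is a charlist or a string.
  def filter_hidden_files(file_names) do
    Enum.reject(file_names, fn file ->
      file
      |> to_string()
      |> Path.basename()
      |> String.starts_with?(".")
    end)
  end
end

paths = [~c"/tmp/a.png", ~c"/tmp/.DS_Store", ~c"/tmp/__MACOSX/._a.png"]
IO.inspect(HiddenFilesSketch.filter_hidden_files(paths))
```

Checking the base name rather than the full path means a hidden file nested in a sub-folder of the archive is filtered out as well.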
Then comes the transaction itself, built by the to_multi function.
defp to_multi([], multi) do
  multi
end

defp to_multi([file | tail], multi) do
  attrs = %{name: file}
  m = %Media{} |> Media.changeset(attrs)
  to_multi(tail, multi |> Multi.insert(file.filename, m))
end
It's straightforward too: we recurse over the file list and add one
insert to our transaction for each file.
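One thing to keep in mind with this approach is that Ecto.Multi expects each operation name to be unique, and to_multi uses file.filename (the base name of the file) as the name. The sketch below uses only the standard library to list the base names that appear more than once in a set of archive paths, which is one way to check this before building the multi. The module name, the function name and the sample paths are hypothetical.

```elixir
defmodule MultiNamesSketch do
  # Hypothetical helper: returns the base names that appear more than once
  # in a list of archive paths (charlists or strings), sorted alphabetically.
  def duplicate_names(paths) do
    paths
    |> Enum.map(fn path -> path |> to_string() |> Path.basename() end)
    |> Enum.frequencies()
    |> Enum.filter(fn {_name, count} -> count > 1 end)
    |> Enum.map(fn {name, _count} -> name end)
    |> Enum.sort()
  end
end

paths = [~c"/tmp/a/logo.png", ~c"/tmp/b/logo.png", ~c"/tmp/c.pdf"]
IO.inspect(MultiNamesSketch.duplicate_names(paths))
```

An empty result means every base name can safely be used as an operation name.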
I hope this quick example will make your work easier if you're
coding in Elixir, or make you want to try it out.
Oh, and if you find the
possible bug in this very basic implementation you can put it as a
