Seeding the Database from a CSV file in Phoenix

When you generate a Phoenix project, there is a priv/repo/seeds.exs file that you can use to seed the database. Let’s see how to seed it from a CSV file.

We’ll be working with the transactions.csv file from Mint, and the project name is Minty. This assumes phoenix.gen.html has been used to create a model and the resources line has been added to web/router.ex.

$ mix phoenix.gen.html Transaction transactions date:string description:string original_description:string amount:string transaction_type:string category:string account_name:string labels:string notes:string

First, consult the documentation for the CSV library and add the dependency to mix.exs:

  defp deps do
    [ ...
     {:csv, "~> 1.2.3"}
    ]
  end

Then in priv/repo/seeds.exs we’ll use the CSV library to generate a Map for each row of the CSV file with keys that EXACTLY match our Ecto model.

# Script for populating the database. You can run it as:
#
#     mix run priv/repo/seeds.exs
#
# Inside the script, you can read and write to any of your
# repositories directly:
#
#     Minty.Repo.insert!(%Minty.SomeModel{})
#
# We recommend using the bang functions (`insert!`, `update!`
# and so on) as they will fail if something goes wrong.

alias Minty.Transaction
alias Minty.Repo

defmodule Minty.Seeds do

  def store_it(row) do
    changeset = Transaction.changeset(%Transaction{}, row)
    Repo.insert!(changeset)
  end

end

File.stream!("/Users/wsmoak/Downloads/transactions.csv")
  |> Stream.drop(1)
  |> CSV.decode(headers: [:date, :description, :original_description, :amount, :transaction_type, :category, :account_name, :labels, :notes])
  |> Enum.each(&Minty.Seeds.store_it/1)

Note that we are dropping the first line of the CSV file. It contains headers, but we need to define different ones that match the Ecto model to make insertion easier.

As the comments indicate, you can run this with mix run priv/repo/seeds.exs.

This works, but it assumes everything is a string. To match up the types we need to do a bit more work.

Date

The date column in the Mint.com CSV file is unfortunately in M[M]/DD/YYYY format. Additionally, the days are zero-padded, but not the months. (?!!) Ecto.Date can work with several formats, but that is not one of them. Let’s flip this around to be YYYY-MM-DD instead.

  def fix_date(%{date: <<m0,"/",d1,d0,"/",y3,y2,y1,y0>>} = row) do
    date = <<y3,y2,y1,y0,"-","0",m0,"-",d1,d0>>
    Map.update!(row,:date,fn _ -> date end)
  end

  def fix_date(%{date: <<m1,m0,"/",d1,d0,"/",y3,y2,y1,y0>>} = row) do
    date = <<y3,y2,y1,y0,"-",m1,m0,"-",d1,d0>>
    Map.update!(row,:date,fn _ -> date end)
  end

This pattern matches on the one- and two-digit months and then rearranges the bits with -’s in between, also zero-padding the one-digit months.

An earlier version of this used String.split and String.rjust, but I think I like it better with the pattern matching.

Now the migration can have add :date, :date and the model can have field :date, Ecto.Date.

Amount

All of the amounts in the CSV are positive numbers, and there is a separate transaction_type field that contains either “debit” or “credit”. Mint doesn’t really do double-entry bookeeping so I have no idea why they use this concept. Let’s keep income as positive numbers and change all the expenses (“debits”) to negative.

We’re still working with Strings at this point, so all we need to do is prepend a dash to indicate that it’s negative.

  def fix_amount(%{transaction_type: "debit"} = row) do
    Map.update!(row,:amount,&("-"<>&1))
  end

  def fix_amount(%{transaction_type: "credit"} = row) do
    row
  end

This pattern matches on a Map containing a key of transaction_type, and either updates the value under the :amount key, (or not.)

Now the amount field can be :float in both the migration and the model.

Re-load

Drop the database, re-create it, and run the (edited) migration again:

$ mix ecto.drop
$ mix ecto.create
$ mix ecto.migrate
$ mix run "priv/repo/seeds.exs"

Now you should have data with a proper date field and positive/negative numbers so that math will work.

Query

Now that the data is loaded, let’s try some queries. I have this in priv/repo/queries.exs

defmodule Minty.Queries do
  import Ecto.Query

  alias Minty.Repo
  alias Minty.Transaction

  def bignum do
    Repo.all(
      from txn in Transaction,
      where: txn.amount > 1000
    )
  end

  def summary do
    Repo.all(
      from txn in Transaction,
      group_by: txn.category,
      select: [txn.category, sum(txn.amount)]
    )
  end
end

Play with it in iex -S mix:

> c("priv/repo/queries.exs")
# large transactions
> Minty.Queries.bignum

# summarized by category, sorted by category name
> Minty.Queries.summary |> Enum.sort |> Enum.each(&(IO.inspect &1))

# summarized by category, sorted by amount
> Minty.Queries.summary |> Enum.sort(&(Enum.at(&1,1)>Enum.at(&2,1))) |> Enum.each(&(IO.puts("#{Enum.at(&1,0)} #{Float.to_string(Enum.at(&1,1),[decimals: 2])}")))

Conclusion

We’ve seen how to seed the database from a CSV file in a Phoenix project including date and float type fields, and some simple queries.

The code for this example is available at https://github.com/wsmoak/minty and is MIT licensed.

Date

Amount

Re-load

Query

Conclusion

References