39

The problem: how to automatically deserialize/unmarshal a record from a CSV file into a Go struct.

For example, I have

type Test struct {
  Name string
  Surname string
  Age int
}

And the CSV file contains the records

John;Smith;42
Piter;Abel;50

Is there an easy way to unmarshal those records into a struct, other than using the "encoding/csv" package to read each record and then doing something like

record, _ := reader.Read()
test := Test{record[0],record[1],atoi(record[2])}
Valentyn Shybanov
    Nope, the pattern you have is the way to go. (I bet `record, _ := Read()` was just to get concise example code here, but do handle the error in your real code, or it'll bite you when the program someday misbehaves and you don't know why.) – twotwotwo Dec 25 '13 at 04:29
  • Ehh... I hoped there was some package that would use reflection like xml/json unmarshalling does. And of course I ignored the error just to minimize the example by skipping non-relevant code. – Valentyn Shybanov Dec 25 '13 at 04:37
  • I wonder why they didn't write such a package. Might be fun to write one yourself. – Tyler Dec 25 '13 at 04:39
  • I know this is many years later, I am just wondering what would be a good use case to do this? I feel like Go is strongly typed and in most cases you would know the schema of the csv up-front, so testing the type of each field in each row would be slow/redundant. Perhaps a tool that is designed to infer and suggest schema to users? If it is to avoid boilerplate or hardcoding type conversions perhaps separating the schema to a struct with a method to convert it is a solution? – Davos Oct 29 '19 at 06:20

5 Answers

41

There is gocarina/gocsv, which handles custom structs in the same way encoding/json does. You can also write custom marshallers and unmarshallers for specific types.

Example:

type Client struct {
    Id      string `csv:"client_id"` // .csv column headers
    Name    string `csv:"client_name"`
    Age     string `csv:"client_age"`
}

func main() {
    in, err := os.Open("clients.csv")
    if err != nil {
        panic(err)
    }
    defer in.Close()

    clients := []*Client{}

    if err := gocsv.UnmarshalFile(in, &clients); err != nil {
        panic(err)
    }
    for _, client := range clients {
        fmt.Println("Hello, ", client.Name)
    }
}
rustyx
pikanezi
  • Beware this lib supports only `*os.File`. Watch your steps if you're handling form data from HTTP (i.e. `multipart.File`). – vahdet Nov 03 '21 at 11:48
15

Seems I'm done with automatic unmarshaling of CSV records into structs (limited to string and int fields). Hope this is useful.

Here is a link to playground: http://play.golang.org/p/kwc32A5mJf

func Unmarshal(reader *csv.Reader, v interface{}) error {
    record, err := reader.Read()
    if err != nil {
        return err
    }
    s := reflect.ValueOf(v).Elem()
    if s.NumField() != len(record) {
        return &FieldMismatch{s.NumField(), len(record)}
    }
    for i := 0; i < s.NumField(); i++ {
        f := s.Field(i)
        switch f.Type().String() {
        case "string":
            f.SetString(record[i])
        case "int":
            ival, err := strconv.ParseInt(record[i], 10, 0)
            if err != nil {
                return err
            }
            f.SetInt(ival)
        default:
            return &UnsupportedType{f.Type().String()}
        }
    }
    return nil
}

// The error types used above (also defined in the playground snippet):

type FieldMismatch struct {
    expected, found int
}

func (e *FieldMismatch) Error() string {
    return "CSV line fields mismatch. Expected " + strconv.Itoa(e.expected) + " found " + strconv.Itoa(e.found)
}

type UnsupportedType struct {
    Type string
}

func (e *UnsupportedType) Error() string {
    return "Unsupported type: " + e.Type
}

I'll try to create a GitHub package if someone needs this implementation.

Valentyn Shybanov
  • Parsing a CSV is a bit more complex than that: you can have quoted fields or multiline records. As part of my own project I made a parser from CSV to map: https://github.com/mcuadros/collector/blob/master/src/format/csv.go If you are interested we can join efforts and maybe build a CSV parser library. – mcuadros Dec 26 '13 at 12:28
  • @mcuadros, but I am using the standard encoding/csv package for parsing, so all these issues with quoted fields are handled by the standard package. The topic of my question was automatic unmarshaling into a static struct (not a dynamic map). Why did you make your own package for parsing CSV? – Valentyn Shybanov Dec 26 '13 at 14:44
  • Oh, I missed that point. BTW the standard package is too slow; this implementation is 3-4x faster, and if you are using another reader or other input you must create a string reader. Another CSV parser implementation can be found at https://github.com/gwenn/yacr – mcuadros Dec 26 '13 at 15:00
  • Actually good advice! I can introduce an interface with one method `Read() []string` that just reads one line from the CSV. That way I can switch between different reader implementations quite easily! – Valentyn Shybanov Dec 26 '13 at 15:06
  • @ValentynShybanov you can replace `*csv.Reader` with an `io.Reader` interface instead. – basebandit Jul 28 '20 at 13:06
1

You could bake your own. Perhaps something like this:

package main

import (
    "fmt"
    "strconv"
    "strings"
)

type Test struct {
    Name    string
    Surname string
    Age     int
}

func (t Test) String() string {
    return fmt.Sprintf("%s;%s;%d", t.Name, t.Surname, t.Age)
}

func (t *Test) Parse(in string) {
    tmp := strings.Split(in, ";")
    t.Name = tmp[0]
    t.Surname = tmp[1]
    t.Age, _ = strconv.Atoi(tmp[2]) // error ignored for brevity; check it in real code
}

func main() {

    john := Test{"John", "Smith", 42}
    fmt.Printf("john:%v\n", john)

    johnString := john.String()
    fmt.Printf("johnString:%s\n", johnString)

    var rebornJohn Test
    rebornJohn.Parse(johnString)
    fmt.Printf("rebornJohn:%v\n", rebornJohn)

}
Mike Kinney
  • Yes, as stated in my question, I wrote it using manual marshaling, so I was looking for some way of automatic marshaling, the same as encoding/xml does. But to implement that there is a need to use reflection... – Valentyn Shybanov Dec 25 '13 at 12:32
1

Using csvutil it is possible to map CSV column headers to struct fields; see the package examples.

In your case, this could be:

package main

import (
    "encoding/csv"
    "fmt"
    "io"
    "os"

    "github.com/jszwec/csvutil"
)

type Test struct {
    Name    string
    Surname string
    Age     int
}

func main() {
    csvFile, err := os.Open("test.csv")
    if err != nil {
        panic(err)
    }
    defer csvFile.Close()

    reader := csv.NewReader(csvFile)
    reader.Comma = ';'

    userHeader, err := csvutil.Header(Test{}, "csv")
    if err != nil {
        panic(err)
    }
    dec, err := csvutil.NewDecoder(reader, userHeader...)
    if err != nil {
        panic(err)
    }

    var users []Test
    for {
        var u Test
        if err := dec.Decode(&u); err == io.EOF {
            break
        } else if err != nil {
            panic(err)
        }
        users = append(users, u)
    }

    fmt.Println(users)
}
mpromonet
0

A simple way to solve the problem is to use JSON as an intermediate representation.

Once you've done this, you have a variety of tools at your disposal.

You can...

  • Unmarshal directly to your type (if it's all strings)
  • Unmarshal to a map[string]interface{} and then make any necessary type conversions
  • Unmarshal -> convert types -> remarshal JSON -> unmarshal to your type

Here's a simple generic conversion function that enables this flow...

// Note: values are inserted without JSON escaping; quotes, backslashes,
// or control characters in the data would produce invalid JSON.
pairToJSON := func(header, record []string) string {
    raw := ""
    for j, v := range record {
        if j != 0 {
            raw += ",\n"
        }
        raw += "\"" + header[j] + "\":\"" + v + "\""
    }
    raw = "{\n" + raw + "\n}"
    return raw
}

The above is compatible with the []string records produced by the standard encoding/csv package.

Brent Bradburn