4

I'm using an external json API that's inconsistent in the way it handles missing values. Sometimes json values show up as empty strings and other times as null. For example...

Case1: datedec and curr are both empty strings.

{
    "symbol": "XYZ",
    "dateex": "2020-09-01",
    "datedec": "",
    "amount": "1.25",
    "curr": "",
    "freq": "annual"
}

Case2: datedec is null. curr is populated.

{
    "symbol": "XYZ",
    "dateex": "2020-09-01",
    "datedec": null,
    "amount": "1.25",
    "curr": "USD",
    "freq": "annual"
}

Here is the struct I'm using to represent a dividend:

type Dividend struct {
    symbol   string `json:"symbol"`
    dateex   string `json:"dateex"`
    datedec  string `json:"datedec"`
    amount   string `json:"amount"`
    curr     string `json:"curr"`
    freq     string `json:"freq"`
}

The problem I'm having is how to insert either an empty string or null, into the database as NULL. I know I could use an omitempty json tag, but then how would I write a function to handle values I don't know will be missing? For example, Here is my current function to insert a dividend into postgresql using the jackc/pgx package:

func InsertDividend(d Dividend) error {
    sql := `INSERT INTO dividends 
    (symbol, dateex, datedec, amount, curr, freq)
    VALUES ($1, $2, $3, $4, $5, $6)`
    conn, err := pgx.Connect(ctx, "DATABASE_URL")
    // handle error 
    defer conn.Close(ctx)
    tx, err := conn.Begin()
    // handle error
    defer tx.Rollback(ctx)
    _, err = tx.Exec(ctx, sql, d.symbol, d.dateex, d.datedec, d.amount, d.curr, d.freq)
    // handle error
    }
    err = tx.Commit(ctx)
    // handle error
    return nil
}

If a value (e.g. datedec or curr) is missing, then this function will error. From this post Golang Insert NULL into sql instead of empty string I saw how to solve Case1. But is there a more general way to handle both cases (null or empty string)?

I've been looking through the database/sql & jackc/pgx documentation but I have yet to find anything. I think the sql.NullString has potential but I'm not sure how I should be doing it.

Any suggestions will be appreciated. Thanks!

7rhvnn
  • 175
  • 1
  • 7
  • 1
    In the SO post you linked one answer shows a function to convert from `""` to `sql.NullString`. You can wrap any variables you want to convert in that function: `_, err = tx.Exec(ctx, sql, d.symbol, d.dateex, NewNullString(d.datedec), d.amount, NewNullString(d.curr), d.freq)`. Won't that work in your case? – phonaputer Sep 07 '20 at 23:56
  • @7rhvnn Note that your `Dividend` type will never work with the json package, or any other package that needs to modify its fields, because all your fields are [unexported](https://golang.org/ref/spec#Exported_identifiers). Note also that `omitempty` is significant only during json *marshaling / encoding*, it has no relevance whatsoever in your current scenario *unmarshaling / decoding* and *db persistance*. – mkopriva Sep 08 '20 at 04:06
  • 1
    @7rhvnn an alternative to the answer provided by Brits is to use [`NULLIF`](https://www.postgresql.org/docs/current/functions-conditional.html#FUNCTIONS-NULLIF) in the sql string. e.g. `INSERT INTO dividends (... datedec, ...) VALUES (... NULLIF($3, ''), ...)` – mkopriva Sep 08 '20 at 04:13
  • 2
    If the webservice you're consuming is broken, fix your data while consuming that webservice. Don't let dirty data enter your database. In Go you can quite easily hook into the unmarshalling process and utilise types such as json.RawMessage and/or implement a custom unmarshal function (https://golang.org/pkg/encoding/json/#example_RawMessage_unmarshal). This way you can fix the data types of the values and have a canonical value object to work with after that process. – Dynom Sep 08 '20 at 07:01
  • @phonaputer That solves the problem. Prior to posting I wrapped that function around all the values EXCEPT for datedec, so that was an oversight on my part. – 7rhvnn Sep 08 '20 at 15:56
  • @mkopriva my actual code uses exported fields, I trying to reduce the confusion by keeping the json and struct fields the same for readability. I had no idea `omitempty` only works on marshalling / encoding. Thanks for that tip! The NULLIF seems like the best solution. BUT will it work for case 2 (when the json value is `null`)? Given go go decodes the json value as `nill`, will `NULLIF` correctly insert `nil` as `NULL` into the database? – 7rhvnn Sep 08 '20 at 16:18
  • @7rhvnn `NULLIF` returns it's first argument value if it doesn't match it's second argument, if first is `NULL` and second is `''` (empty string), then it will return first, which is `NULL`. So yes, `NULLIF($3, '')` will result in `NULL` in both cases, when `$3` is `NULL` and when `$3` is `''` (empty string). – mkopriva Sep 08 '20 at 16:22
  • @7rhvnn and remember to use the opposite `COALESCE`, i.e. `COALESCE(column_name, '')`, when you're retrieving such a column (one that can be `NULL`) and scanning it into a non-pointer, non-custom-scanner, string. If you don't `row(s).Scan` will result in an error whenever the column is `NULL`. – mkopriva Sep 08 '20 at 16:27

2 Answers2

2

There are a number of ways you can represent NULL when writing to the database. sql.NullString is an option as is using a pointer (nil = null); the choice really comes down to what you find easer to understand. Rus Cox commented:

There's no effective difference. We thought people might want to use NullString because it is so common and perhaps expresses the intent more clearly than *string. But either will work.

I suspect that using pointers will be the simplest approach in your situation. For example the following will probably meet your needs:

type Dividend struct {
    Symbol  string  `json:"symbol"`
    Dateex  string  `json:"dateex"`
    Datedec *string `json:"datedec"`
    Amount  string  `json:"amount"`
    Curr    *string `json:"curr"`
    Freq    string  `json:"freq"`
}

func unmarshal(in[]byte, div *Dividend) {
    err := json.Unmarshal(in, div)
    if err != nil {
        panic(err)
    }
    // The below is not necessary unless if you want to ensure that blanks
    // and missing values are both written to the database as NULL...
    if div.Datedec != nil && len(*div.Datedec) == 0 {
        div.Datedec = nil
    }
    if div.Curr != nil && len(*div.Curr) == 0 {
        div.Curr = nil
    }
}

Try it in the playground.

You can use the Dividend struct in the same way as you are now when writing to the database; the SQL driver will write the nil as a NULL.

Brits
  • 14,829
  • 2
  • 18
  • 31
  • 1
    How could this be implement using sql.NullString? Would I also have to write a custom Unmarshal function or overwrite the func (*NullString) Scan or func (*NullString) Value function? – 7rhvnn Sep 08 '20 at 17:00
  • 1
    Yes - see the answers to [this question](https://stackoverflow.com/questions/33072172/how-can-i-work-with-sql-null-values-and-json-in-a-good-way) - as this is fairly common there are packages like [null](https://gopkg.in/guregu/null.v4) that will do this for you. – Brits Sep 08 '20 at 21:15
  • well that is awful...Not your answer, just this mechanism, is there a way to zero the value? Like empty string? – Matteo Feb 25 '21 at 12:54
  • 1
    @Madeo In SQL Null and blanks are different things and in JSON undefined and blanks are different things. The Go standard library provides a way of handling most cases and third party libraries (e.g. [null/zero](https://pkg.go.dev/github.com/guregu/null#readme-zero-package)) simplify handling less common requirements. If you have a specific issue that is not already addressed it's probably best to ask it in a new question. – Brits Feb 25 '21 at 20:01
  • 1
    I have ended up using `COALESCE` – Matteo Feb 26 '21 at 00:53
  • String pointer? Really? – TheRealChx101 Apr 04 '23 at 01:56
  • @TheRealChx101 yes, string pointers - your point is? (it's a [common](https://pkg.go.dev/github.com/jackc/pgx/v5@v5.3.1/pgtype#hdr-Null_Values) approach and I mentioned alternatives). – Brits Apr 04 '23 at 05:18
  • @Brits I wasn't making any point. I only asked a question which you answered. What is YOUR point? – TheRealChx101 Apr 04 '23 at 15:08
1

you can also use pgtypes and get the SQL Driver value from any pgtype using the Value() func:

https://github.com/jackc/pgtype

https://github.com/jackc/pgtype/blob/master/text.go

type Dividend struct {
    symbol   pgtype.Text `json:"symbol"`
    dateex   pgtype.Text `json:"dateex"`
    datedec  pgtype.Text `json:"datedec"`
    amount   pgtype.Text `json:"amount"`
    curr     pgtype.Text `json:"curr"`
    freq     pgtype.Text `json:"freq"`
}

func InsertDividend(d Dividend) error {
    // --> get SQL values from d
    var err error
    symbol, err := d.symbol.Value() // see https://github.com/jackc/pgtype/blob/4db2a33562c6d2d38da9dbe9b8e29f2d4487cc5b/text.go#L174
    if err != nil {
        return err
    }
    dateex, err := d.dateex.Value()
    if err != nil {
        return err
    }
    // ...

    sql := `INSERT INTO dividends 
    (symbol, dateex, datedec, amount, curr, freq)
    VALUES ($1, $2, $3, $4, $5, $6)`
    conn, err := pgx.Connect(ctx, "DATABASE_URL")
    defer conn.Close(ctx)
    tx, err := conn.Begin()
    defer tx.Rollback(ctx)
    // --> exec your query using the SQL values your get earlier
    _, err = tx.Exec(ctx, sql, symbol, dateex, datedec, amount, curr, freq)
    // handle error
    }
    err = tx.Commit(ctx)
    // handle error
    return nil
}
Big_Boulard
  • 799
  • 1
  • 13
  • 28