I am trying to create a BigQuery table from an Avro file. When I run the BigQuery load job, I get this error:

"Error while reading data, error message: The Apache Avro library failed to parse the header with the following error: Unexpected type for default value. Expected long, but found null: null"

The schema of the Avro file is:

{
  "type" : "record",
  "name" : "Pair",
  "namespace" : "org.apache.avro.mapred",
  "fields" : [ {
    "name" : "key",
    "type" : "int",
    "doc" : ""
  }, {
    "name" : "value",
    "type" : {
      "type" : "record",
      "name" : "CustomerInventoryOrderItems",
      "namespace" : "com.test.customer.order",
      "fields" : [ {
        "name" : "updated_at",
        "type" : "long"
      }, {
        "name" : "inventory_order_items",
        "type" : {
          "type" : "map",
          "values" : {
            "type" : "array",
            "items" : {
              "type" : "record",
              "name" : "CustomerInventoryOrderItem",
              "fields" : [ {
                "name" : "order_item_id",
                "type" : "int",
                "default" : null
              }, {
                "name" : "updated_at",
                "type" : "long"
              }, {
                "name" : "created_at",
                "type" : "long"
              }, {
                "name" : "product_id",
                "type" : [ "null", "int" ],
                "default" : null
              }, {
                "name" : "type_id",
                "type" : "int",
                "default" : null
              }, {
                "name" : "event_id",
                "type" : [ "null", "int" ],
                "default" : null
              }, {
                "name" : "price",
                "type" : [ "null", "double" ],
                "default" : null
              }, {
                "name" : "tags",
                "type" : [ "null", "string" ],
                "default" : null
              }, {
                "name" : "estimated_ship_date",
                "type" : [ "null", "long" ],
                "default" : null
              } ]
            }
          }
        }
      } ]
    },
    "doc" : "",
    "order" : "ignore"
  } ]
}

I cannot tell whether the problem is in the schema or somewhere else, and it is preventing me from loading the data.

defcon

1 Answer

The problem is most likely the fields that are declared with type int but have null as their default value. For example:

                "name" : "type_id",
                "type" : "int",
                "default" : null

Either change the default to an integer, or change the type to a union that includes null (as many of the other fields already do).
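
To find every field with this mismatch rather than checking by eye, something like the following could help. It is only a rough sketch using the standard `json` module: the function names `allows_null` and `find_bad_null_defaults` are made up for illustration, the schema below is a trimmed-down copy of the one in the question, and it only checks null defaults, not other kinds of default/type mismatch.

```python
import json


def allows_null(avro_type):
    """Return True if the given Avro type is null or a union containing null."""
    if isinstance(avro_type, list):  # a union is written as a JSON array
        return "null" in avro_type
    if isinstance(avro_type, dict):
        return avro_type.get("type") == "null"
    return avro_type == "null"


def find_bad_null_defaults(schema, path=""):
    """Yield (field_path, field_type) for fields whose default is null
    although their type does not admit null."""
    if isinstance(schema, list):  # union: inspect every branch
        for branch in schema:
            yield from find_bad_null_defaults(branch, path)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind == "record":
            for field in schema.get("fields", []):
                field_path = f"{path}.{field['name']}" if path else field["name"]
                field_type = field["type"]
                # JSON null is parsed as None; a missing default is not flagged
                if "default" in field and field["default"] is None \
                        and not allows_null(field_type):
                    yield field_path, field_type
                yield from find_bad_null_defaults(field_type, field_path)
        elif kind == "array":
            yield from find_bad_null_defaults(schema["items"], path + "[]")
        elif kind == "map":
            yield from find_bad_null_defaults(schema["values"], path + "{}")


# Trimmed-down version of the schema from the question (illustration only).
SCHEMA_JSON = """
{
  "type": "record",
  "name": "CustomerInventoryOrderItems",
  "fields": [
    {"name": "updated_at", "type": "long"},
    {"name": "inventory_order_items", "type": {
      "type": "map",
      "values": {"type": "array", "items": {
        "type": "record",
        "name": "CustomerInventoryOrderItem",
        "fields": [
          {"name": "order_item_id", "type": "int", "default": null},
          {"name": "product_id", "type": ["null", "int"], "default": null},
          {"name": "type_id", "type": "int", "default": null}
        ]
      }}
    }}
  ]
}
"""

problems = list(find_bad_null_defaults(json.loads(SCHEMA_JSON)))
for field_path, field_type in problems:
    print(f"{field_path}: type {field_type!r} has default null")
```

On the trimmed schema this flags `order_item_id` and `type_id` and leaves `product_id` alone, which matches the fields in the question that are plain int with a null default. Keep in mind that the schema is stored in the header of the Avro file itself, so the file would have to be written again with the corrected schema before the load is retried.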

Scott
  • This might be an issue, but the error I am getting points to a long attribute, not an integer. Is there anything else I am missing? – defcon Mar 08 '21 at 18:15
  • I don't know which library is throwing the error, but from a quick search it looks like it could come from here: https://github.com/confluentinc/avro-cpp-packaging/blob/c76e6ef3cfce20f36e1385cd88e2fa613ba10ae1/impl/Compiler.cc#L158-L159. If that is the right library, then the check for int uses the long type for comparisons: https://github.com/confluentinc/avro-cpp-packaging/blob/c76e6ef3cfce20f36e1385cd88e2fa613ba10ae1/impl/Compiler.cc#L222-L224. Did you try fixing the ints? The error message might say long when it is really checking the ints. – Scott Mar 08 '21 at 20:36