
This question is in reference to: How to read data from AVRO file using C++ interface?

#include <iostream>
#include <fstream>

#include <avro/DataFile.hh>
#include <avro/Generic.hh>

// ANSI escape codes used to colour the error message below
static const char* BOLD = "\033[1m";
static const char* RED  = "\033[31m";
static const char* ENDC = "\033[0m";

int main(int argc, char** argv)
{
    std::cout << "AVRO Test\n" << std::endl;

    if (argc < 2)
    {
        std::cerr << BOLD << RED << "ERROR: " << ENDC << "please provide an "
                  << "input file\n" << std::endl;
        return -1;
    }

    // Open the container file; the writer schema comes from the file header
    avro::DataFileReader<avro::GenericDatum> reader(argv[1]);
    auto dataSchema = reader.dataSchema();

    // Write out the data schema in JSON for grins
    std::ofstream output("data_schema.json");
    dataSchema.toJson(output);
    output.close();

    // Generic datum shaped by the schema; reused for every record
    avro::GenericDatum datum(dataSchema);
    while (reader.read(datum))
    {
        std::cout << "Type: " << datum.type() << std::endl;
        if (datum.type() == avro::AVRO_RECORD)
        {
            const avro::GenericRecord& r = datum.value<avro::GenericRecord>();
            std::cout << "Field-count: " << r.fieldCount() << std::endl;

            // TODO: pull out each field
        }
    }

    return 0;
}

I used this code, but I keep getting a seg fault at the while loop. I have a very large schema and a large amount of data. Decoding the data piece by piece, the way Avro's "cpx" example does, is not practical; I need a generic way of reading. The seg fault happens on the third pass through the loop (consistently), with no error returned from read(). Open to any and all suggestions and ideas about reading large schemas in Avro.

1 Answer

As it turns out, there is an open ticket on the Avro issue tracker for this exact problem: https://issues.apache.org/jira/browse/AVRO-3194
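Not a fix for the crash itself, but since the question also asks for a generic way to pull each field out of a record: below is a minimal, untested sketch of how a GenericDatum could be walked recursively once read() succeeds, using the GenericRecord/GenericDatum API (fieldCount(), fieldAt(), schema()->nameAt()). The printDatum name, the indentation handling, and the set of types covered are my own choices for illustration, not anything taken from the ticket.

#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

#include <avro/GenericDatum.hh>

// Hypothetical helper: recursively dump a GenericDatum. Record fields are
// visited with fieldCount()/fieldAt(); field names come from the record's
// schema node via nameAt(). GenericDatum resolves union branches itself,
// so type()/value() already refer to the active branch.
static void printDatum(const avro::GenericDatum& d, const std::string& indent = "")
{
    switch (d.type())
    {
    case avro::AVRO_RECORD:
    {
        const avro::GenericRecord& rec = d.value<avro::GenericRecord>();
        for (size_t i = 0; i < rec.fieldCount(); ++i)
        {
            std::cout << indent << rec.schema()->nameAt(i) << ":" << std::endl;
            printDatum(rec.fieldAt(i), indent + "  ");
        }
        break;
    }
    case avro::AVRO_ARRAY:
    {
        const std::vector<avro::GenericDatum>& items =
            d.value<avro::GenericArray>().value();
        for (const avro::GenericDatum& item : items)
            printDatum(item, indent + "  ");
        break;
    }
    case avro::AVRO_STRING:
        std::cout << indent << d.value<std::string>() << std::endl;
        break;
    case avro::AVRO_INT:
        std::cout << indent << d.value<int32_t>() << std::endl;
        break;
    case avro::AVRO_LONG:
        std::cout << indent << d.value<int64_t>() << std::endl;
        break;
    case avro::AVRO_FLOAT:
        std::cout << indent << d.value<float>() << std::endl;
        break;
    case avro::AVRO_DOUBLE:
        std::cout << indent << d.value<double>() << std::endl;
        break;
    case avro::AVRO_BOOL:
        std::cout << indent << (d.value<bool>() ? "true" : "false") << std::endl;
        break;
    case avro::AVRO_NULL:
        std::cout << indent << "null" << std::endl;
        break;
    default:
        // Maps, enums, bytes and fixed are left out of this sketch
        std::cout << indent << "(unhandled type " << d.type() << ")" << std::endl;
        break;
    }
}

With a helper like this, the "TODO: pull out each field" line in the loop above would reduce to a single printDatum(datum); call.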
