
I am creating a log crawler to combine logs from all our Azure applications. Some of them store logs in SLAB format and some simply use the Azure Diagnostics tracer. Since version 2.6, the Azure Diagnostics tracer actually creates two Timestamp columns in the Azure table "WADLogsTable". Microsoft's explanation for this behavior is the following:

"https://azure.microsoft.com/en-us/documentation/articles/vs-azure-tools-diagnostics-for-cloud-services-and-virtual-machines/"

TIMESTAMP is PreciseTimeStamp rounded down to the upload frequency boundary. So, if your upload frequency is 5 minutes and the event time is 00:17:12, TIMESTAMP will be 00:15:00.

Timestamp is the timestamp at which the entity was created in the Azure table.
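
In code terms, the rounding just floors PreciseTimeStamp to the transfer period boundary. A minimal sketch, assuming the 5-minute schedule from the example above:

    using System;

    class TimestampRounding
    {
        static void Main()
        {
            // Scheduled transfer period, assumed to be 5 minutes as in the example above.
            TimeSpan period = TimeSpan.FromMinutes(5);

            // A precise event time of 00:17:12...
            DateTimeOffset precise = DateTimeOffset.Parse("2016-08-18T00:17:12Z");

            // ...floored to the transfer period boundary gives 00:15:00, which is what
            // ends up in the TIMESTAMP column.
            DateTimeOffset floored = new DateTimeOffset(
                precise.UtcTicks - precise.UtcTicks % period.Ticks, TimeSpan.Zero);

            Console.WriteLine(floored.ToString("u")); // 2016-08-18 00:15:00Z
        }
    }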

Sadly, Azure Search currently only supports case-insensitive column mapping, so when I create a simple datasource, index and indexer, I get an exception about multiple columns with the same name (Timestamp) existing in the datasource.
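
For reference, this is roughly the setup that fails. It's a sketch against the REST API; the service URL, API version, admin key, object names and field list are all placeholders or assumptions rather than my exact definitions:

    using System;
    using System.Net.Http;
    using System.Text;
    using System.Threading.Tasks;

    class CreateWadLogsSearchObjects
    {
        // Placeholders - substitute your own search service, admin key and storage connection string.
        const string SearchServiceUrl = "https://<my-search-service>.search.windows.net";
        const string ApiVersion = "2015-02-28-Preview";
        const string AdminApiKey = "<admin-api-key>";

        static void Main() => RunAsync().GetAwaiter().GetResult();

        static async Task RunAsync()
        {
            using (var client = new HttpClient())
            {
                client.DefaultRequestHeaders.Add("api-key", AdminApiKey);

                // Data source pointing at the WAD table.
                await PutAsync(client, "datasources/wadlogs-ds", @"{
                    ""name"": ""wadlogs-ds"",
                    ""type"": ""azuretable"",
                    ""credentials"": { ""connectionString"": ""<storage-connection-string>"" },
                    ""container"": { ""name"": ""WADLogsTable"" }
                }");

                // Index schema. 'Key' is the PartitionKey+RowKey concatenation produced by the
                // table indexer; the other fields are trimmed down for this example.
                await PutAsync(client, "indexes/wadlogs-ix", @"{
                    ""name"": ""wadlogs-ix"",
                    ""fields"": [
                        { ""name"": ""Key"", ""type"": ""Edm.String"", ""key"": true },
                        { ""name"": ""Timestamp"", ""type"": ""Edm.DateTimeOffset"" },
                        { ""name"": ""Message"", ""type"": ""Edm.String"", ""searchable"": true }
                    ]
                }");

                // Indexer tying the two together.
                await PutAsync(client, "indexers/wadlogs-indexer", @"{
                    ""name"": ""wadlogs-indexer"",
                    ""dataSourceName"": ""wadlogs-ds"",
                    ""targetIndexName"": ""wadlogs-ix""
                }");
            }
        }

        static async Task PutAsync(HttpClient client, string path, string json)
        {
            string url = $"{SearchServiceUrl}/{path}?api-version={ApiVersion}";
            HttpResponseMessage response = await client.PutAsync(
                url, new StringContent(json, Encoding.UTF8, "application/json"));
            Console.WriteLine($"{path}: {(int)response.StatusCode}");
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }

Creating the objects along these lines is where the "multiple columns with the same name (Timestamp)" exception comes back.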

I tried not using Timestamp and using PreciseTimeStamp instead, but then I get a different exception:

"Specified cast is not valid.Couldn't store <8/18/2016 12:10:00 AM> in Timestamp Column. Expected type is DateTimeOffset."

I assume this is because the current Azure Table datasource insists on keeping track of the Timestamp column for change tracking behind the scenes.

The behavior is the same whether I create all the objects programmatically or use the "Import Data" functionality in the portal.

Does anyone have any other strategy or approach to overcome this issue?

We are happily indexing our SLAB tables, by the way; it's just the WAD table that is failing now.

  • Did you try to store a real timestamp rather than DateTime? http://stackoverflow.com/a/9814280/1384539 – Thiago Custodio Aug 22 '16 at 01:41
  • That cast error message shows up only if the Timestamp column is not selected in the index schema at all, which I assume is because Azure Search adds that column automatically behind the scenes. The real issue is that I can't add the Timestamp column to the schema at all, because of the second column called TIMESTAMP. – Murat Boduroglu Aug 22 '16 at 01:44
  • Hi Murat, we'll investigate the "specified cast is not valid" issue. I'll respond here once we have more details. Thanks! – Eugene Shvets Aug 23 '16 at 22:29
  • Hi Murat, we've fixed the 'Specified cast is not valid' issue; that fix will be rolled out in the course of next week. With this fix, you'll be able to programmatically create the indexer as long as the target index does not use any of the duplicate columns (i.e., exactly what you tried to do using PreciseTimestamp). Note that the Data Import wizard will not support this scenario for now. Once the fix is deployed, I'll follow up here. It would be great if you could email me at eugenesh at the usual Microsoft domain so we have a more efficient communication channel. Thanks! – Eugene Shvets Aug 24 '16 at 03:12
  • Hi Eugene, thanks for the fast response and for looking into this issue; I will ping you by email. If I start using PreciseTimestamp, how are you going to track changes? Are you still going to use the Timestamp column in the background, or will we have to implement it manually? – Murat Boduroglu Aug 24 '16 at 04:02
