I have an application composed of two ASP.NET Core apps, app A and app B. App A makes HTTP calls to app B, and Application Insights automatically correlates these calls and shows them as a single request. Great!

However, I'm now moving to a more event-based system design, where app A publishes an event to an Azure Event Grid, and app B is set up with a webhook to listen to that event.

Having made that change, the telemetry correlation is broken and it no longer shows up as a single operation.

I have read this documentation: https://learn.microsoft.com/en-us/azure/azure-monitor/app/correlation which explains the theory around correlation headers - but how can I apply this to the Event Grid and get it to forward the correlation headers on to the subscribing endpoints?

gallivantor

2 Answers

1

The header pass-through idea for custom topics in AEG was recently (Oct. 10th) marked as unplanned.

However, the headers can be passed via the AEG model to the subscribers in the data object of the event message. This mediation can be done, for example, using the Policies in Azure API Management.
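To illustrate carrying the correlation header inside the data object, here is a minimal sketch using only `System.Diagnostics.Activity` and `System.Text.Json` (.NET 5 or later). The `CorrelatedEvent` envelope, the `CorrelationSketch` class, and its method names are hypothetical, and the sketch leaves out Event Grid and Application Insights entirely; it only shows the trace context surviving a round trip through the serialized event data.

```csharp
using System;
using System.Diagnostics;
using System.Text.Json;

// Hypothetical envelope: the business payload plus the publisher's
// W3C trace context (the value a "traceparent" header would carry).
public sealed record CorrelatedEvent(string Payload, string TraceParent);

public static class CorrelationSketch
{
    // Publisher side: start an activity and embed its id in the event data.
    public static string Wrap(string payload)
    {
        using var activity = new Activity("Publish")
            .SetIdFormat(ActivityIdFormat.W3C)
            .Start();
        return JsonSerializer.Serialize(new CorrelatedEvent(payload, activity.Id));
    }

    // Subscriber side: read the id back and start a child activity under it.
    // The caller is responsible for stopping (disposing) the returned activity.
    public static Activity StartHandling(string eventDataJson)
    {
        var envelope = JsonSerializer.Deserialize<CorrelatedEvent>(eventDataJson);
        return new Activity("Handle")
            .SetParentId(envelope.TraceParent)
            .Start();
    }
}
```

The activity started on the subscriber side shares the publisher's trace id, which is what lets a telemetry backend stitch both sides into one operation.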

UPDATE:

The following documents can help with manual instrumentation of the webhook endpoint handler (subscriber side) using custom operation tracking:

Track custom operations with Application Insights .Net SDK

Application Insights API for custom events and metrics

Roman Kiss
  • So if you were to manually pass the headers in the data object, how would you initialize App Insights with the correct Operation Id on the receiving end (assuming the receiving end is an ASP.NET Core app that has already automatically initialised a new AppInsights operation)? – gallivantor Oct 28 '19 at 04:30
  • See my *Update*. Basically, the AEG Pub/Sub eventing model can use the same tracking pattern across boundaries (such as processes, domains, machines, etc.). – Roman Kiss Oct 29 '19 at 15:49
  • @gallivantor did you ever get this working properly? If so could you elaborate on what you did? – Dzejms Feb 10 '20 at 19:57
  • There is a post (https://medium.com/@tsuyoshiushio/correlation-with-activity-with-application-insights-1-overview-753a48a645fb) that goes over how to do this with a code sample here: https://github.com/TsuyoshiUshio/ActivitySpike/blob/master/CorrelationBasic/Function1.cs Thank you Tsuyoshi Ushio! – Dzejms Feb 13 '20 at 21:14
1
  1. Add two correlation properties to all your events:

    public string OperationId { get; set; }
    public string OperationParentId { get; set; }
    
  2. Publisher side: create a Dependency and populate these properties.

    private Microsoft.ApplicationInsights.TelemetryClient _telemetryClient;
    
    async Task Publish<TEventData>(TEventData data)
    {
        var @event = new EventGridEvent
        {
            Id = Guid.NewGuid().ToString(),
            EventTime = DateTime.UtcNow,
            EventType = typeof(TEventData).FullName,
            Data = data
        };  
    
        string operationName = "Publish " + @event.EventType;
    
        // StartOperation is a helper method that initializes the telemetry item
        // and allows correlation of this operation with its parent and children.
        var operation =
             _telemetryClient.StartOperation<DependencyTelemetry>(operationName);
        operation.Telemetry.Type = "EventGrid";
        operation.Telemetry.Data = operationName;
    
        // Ideally, the correlation properties would travel in the request headers,
        // but with the current implementation of Event Grid there is no other way
        // than to store them in the event data.
        // Note: this requires TEventData to expose these two properties,
        // e.g. via a generic constraint on a shared base type or interface.
        data.OperationId = operation.Telemetry.Context.Operation.Id;
        data.OperationParentId = operation.Telemetry.Id;
    
        try
        {
            AzureOperationResponse result = await _client
                .PublishEventsWithHttpMessagesAsync(_topic, new[] { @event });
            result.Response.EnsureSuccessStatusCode();
    
            operation.Telemetry.Success = true;
        }
        catch (Exception ex)
        {
            operation.Telemetry.Success = false;
            _telemetryClient.TrackException(ex);
            throw;
        }
        finally
        {
            _telemetryClient.StopOperation(operation);
        }
    }
    
  3. Consumer side: create a Request and restore the correlation.

    [FunctionName(nameof(YourEventDataConsumer))]
    void YourEventDataConsumer([EventGridTrigger] EventGridEvent @event)
    {
        var data = (YourEventData)@event.Data;
    
        var operation = _telemetryClient.StartOperation<RequestTelemetry>(
            "Handle " + @event.EventType,
            data.OperationId,
            data.OperationParentId);
        try
        {
            // Do some event processing.
    
            operation.Telemetry.Success = true;
            operation.Telemetry.ResponseCode = "200";
        }
        catch (Exception)
        {
            operation.Telemetry.Success = false;
            operation.Telemetry.ResponseCode = "500";
            throw;
        }
        finally
        {
            _telemetryClient.StopOperation(operation);
        }
    }
    

This works, but it is not ideal because you need to repeat this code in every consumer. Also, some early log messages (e.g. those emitted by constructors of injected services) are still not correlated correctly.

A better approach would be to create a custom EventGridTriggerAttribute (recreate the whole Microsoft.Azure.WebJobs.Extensions.EventGrid extension) and move this code into IAsyncConverter.ConvertAsync().
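Short of recreating the extension, the repetition in step 3 could at least be reduced by factoring the boilerplate into a helper. The following is a minimal sketch, assuming the Microsoft.ApplicationInsights NuGet package is referenced; `CorrelatedRequest` and `RunAsync` are hypothetical names, and this does not fix the early log messages mentioned above.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.DataContracts;

// Hypothetical helper: wraps a handler in a RequestTelemetry operation
// that is correlated with the publisher's operation ids.
public static class CorrelatedRequest
{
    public static async Task RunAsync(
        TelemetryClient telemetryClient,
        string operationName,
        string operationId,
        string operationParentId,
        Func<Task> handler)
    {
        // Same StartOperation overload as in the consumer code above.
        var operation = telemetryClient.StartOperation<RequestTelemetry>(
            operationName, operationId, operationParentId);
        try
        {
            await handler();

            operation.Telemetry.Success = true;
            operation.Telemetry.ResponseCode = "200";
        }
        catch (Exception)
        {
            operation.Telemetry.Success = false;
            operation.Telemetry.ResponseCode = "500";
            throw;
        }
        finally
        {
            telemetryClient.StopOperation(operation);
        }
    }
}
```

A consumer would then call something like `await CorrelatedRequest.RunAsync(_telemetryClient, "Handle " + @event.EventType, data.OperationId, data.OperationParentId, () => Process(data));`, where `Process` stands in for your own event handling.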

Monsignor