Proper way of handling connections to an external Microsoft SQL VM cluster

Question

I have several .NET Core microservices running in my Kubernetes cluster (1.19.1), all with the Istio sidecar proxy (1.9.1). I am seeing flaky connection behavior when calling the microservice that connects to the external SQL cluster. When a connection fails, the sidecar logs show this:

Istio sidecar log:

[2021-05-26T15:00:04.585Z] "- - -" 0 UF,URX - - "-" 0 0 9909 - "-" "-" "-" "-" "11.11.11.11:1433" PassthroughCluster - 11.11.11.11:1433 100.96.13.10:51662 - -
[2021-05-26T15:00:04.585Z] "- - -" 0 UF,URX - - "-" 0 0 9910 - "-" "-" "-" "-" "22.22.22.22:1433" PassthroughCluster - 22.22.22.22:1433 100.96.13.10:59498 - -
[2021-05-26T15:00:04.491Z] "- - -" 0 UF,URX - - "-" 0 0 10003 - "-" "-" "-" "-" "22.22.22.22:1433" PassthroughCluster - 22.22.22.22:1433 100.96.13.10:59484 - -
[2021-05-26T15:00:04.491Z] "- - -" 0 UF,URX - - "-" 0 0 10003 - "-" "-" "-" "-" "33.33.33.33:1433" PassthroughCluster - 33.33.33.33:1433 100.96.13.10:51648 - -
[2021-05-26T15:00:04.491Z] "- - -" 0 UF,URX - - "-" 0 0 10003 - "-" "-" "-" "-" "44.44.44.44:1433" PassthroughCluster - 44.44.44.44:1433 100.96.13.10:58482 - -
[2021-05-26T15:00:04.585Z] "- - -" 0 UF,URX - - "-" 0 0 10001 - "-" "-" "-" "-" "44.44.44.44:1433" PassthroughCluster - 44.44.44.44:1433 100.96.13.10:58496 - -
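
For reference, the log lines above can be decoded with a short script. This is only a sketch: it assumes the default Istio 1.9 text access-log format, so the field positions are an assumption and a customized log format would shift them. The flag meanings are taken from the Envoy access-log documentation, and the name parse_tcp_log_line is hypothetical.

```python
import shlex

# Meanings of the two response flags seen above, per the Envoy access-log docs.
RESPONSE_FLAGS = {
    "UF": "upstream connection failure",
    "URX": "upstream retry limit (HTTP) or maximum connect attempts (TCP) reached",
}


def parse_tcp_log_line(line):
    """Extract a few fields from one sidecar access-log line.

    Field positions assume the default Istio 1.9 text format (an assumption).
    """
    timestamp, rest = line.split("] ", 1)
    # shlex keeps quoted fields such as "- - -" together as a single token.
    fields = shlex.split(rest)
    flags = fields[2].split(",")
    return {
        "timestamp": timestamp.lstrip("["),
        "flags": flags,
        "flag_meanings": [RESPONSE_FLAGS.get(f, "unknown") for f in flags],
        "duration_ms": int(fields[8]),
        "upstream_host": fields[14],
        "upstream_cluster": fields[15],
        "downstream_remote": fields[18],
    }
```

Applied to the first line above, this gives the flags UF and URX, a duration of 9909 ms, and the upstream host 11.11.11.11:1433 on PassthroughCluster. All six durations sit close to 10000 ms, which may point to a connect timeout of roughly ten seconds, though the log alone does not confirm that.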

App log exception:

Unhandled exception: A connection was successfully established with the server, but then an error occurred during the pre-login handshake. (provider: TCP Provider, error: 35 - An internal exception was caught)
System.Data.SqlClient.SqlException (0x80131904): A connection was successfully established with the server, but then an error occurred during the pre-login handshake. (provider: TCP Provider, error: 35 - An internal exception was caught)

Note on the SQL cluster: in the app config we use a DNS name for the availability group listener, e.g. ag_listener.mydomain.com, to point to the HA SQL cluster.
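
To check which addresses the listener name resolves to from inside a pod, something like the following could be used. It is a minimal sketch that assumes Python is available in the container; resolve_all is a hypothetical helper name, and the port 1433 default simply mirrors the port in the logs above.

```python
import socket


def resolve_all(host, port=1433):
    """Return the sorted, de-duplicated IP addresses that `host` resolves to."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling resolve_all("ag_listener.mydomain.com") in both prod and nonprod would show whether the listener name returns one address or several in each environment.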

This all works in our nonprod environment with no issues. We run Istio there as well, but nonprod has only a single SQL instance.

I have made sure outboundTrafficPolicy is set to ALLOW_ANY, but I am still seeing the flaky connections. It doesn't fail every time, but it is highly inconsistent, and my team has struggled to resolve it. Is there a proper way in Istio to handle connections to an MSSQL database cluster with multiple IPs? Thank you.

Additional note: I have tried the following ServiceEntry without any luck:

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: prod-sql-service-entry
spec:
  addresses:
  - 11.11.11.11/32
  - 22.22.22.22/32
  - 33.33.33.33/32
  - 44.44.44.44/32
  hosts:
  - '*.mydomain.com'
  location: MESH_EXTERNAL
  ports:
  - name: tcp
    number: 1433
    protocol: TCP

Comments

Here is the connection string we are using: "DB": "Data Source=ag_listener.mycompany.com;initial catalog=DB;persist security info=True;user id=username;password=secretPassword;MultipleActiveResultSets=True;MultiSubnetFailover=True;Connection Timeout=120;Encrypt=True;TrustServerCertificate=True;ApplicationIntent=ReadWrite;", - Noah Dlugoszewski, May 27 '21 at 14:50

I'm not sure if this is helpful, but here is something I found: https://learn.microsoft.com/en-us/answers/questions/334126/sql-connection-error-35.html?childToView=350106#answer-350106 - Matt, May 28 '21 at 11:20

I looked at the link you posted, Matt, but on our end pooling is already set to true by default. - Noah Dlugoszewski, May 28 '21 at 16:51

A note for anyone who comes across this: disabling the sidecar completely resolves the issue. That defeats the purpose of using Istio, but it does show that the problem lies with the sidecar. - Noah Dlugoszewski, May 28 '21 at 20:28

0 Answers