
TL;DR: How can we configure Istio sidecar injection / istio-proxy / Envoy proxy / the Istio egress gateway to allow long-lived (>3 hours), possibly idle, TCP connections?

Some details:

We're trying to run a database migration against PostgreSQL. It is triggered by an application configured with Spring Boot + Flyway, and it is expected to take ~3 hours.

Our application is deployed in our Kubernetes cluster, which has Istio sidecar injection enabled. After exactly one hour of running the migration, the connection is always closed.

We're sure it's istio-proxy closing the connection: we ran the migration from a pod without Istio sidecar injection and it kept running for longer than one hour. However, this is not an option going forward, as it may imply some downtime in production, which we can't accept.

We suspect this should be configurable in istio-proxy by setting the idle_timeout parameter, which was implemented here. However, this isn't working, or we aren't configuring it properly. We're trying to set it during Istio installation by adding --set gateways.istio-ingressgateway.env.ISTIO_META_IDLE_TIMEOUT=5s to our helm template.
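For reference, the --set flag above should be equivalent to the following fragment of a Helm values file. This is only a sketch: it assumes the legacy Helm chart layout implied by the flag's key path, and the file name values.yaml is hypothetical. Note that this key path targets the ingress gateway deployment, so on its own it would not be expected to change the timeout of injected sidecars.

```yaml
# values.yaml (hypothetical file name), passed to helm with -f values.yaml
# Mirrors: --set gateways.istio-ingressgateway.env.ISTIO_META_IDLE_TIMEOUT=5s
gateways:
  istio-ingressgateway:
    env:
      # Applies to the ingress gateway proxy only, not to sidecars.
      ISTIO_META_IDLE_TIMEOUT: 5s
```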

Yayotrón
  • What is your Istio version? – Jakub Sep 11 '20 at 11:23
  • 1.5.0, we both work at EPAM btw :) – Yayotrón Sep 11 '20 at 13:40
  • That's happening because the idle timeout is defined as the period in which no bytes are sent or received on either the upstream or downstream connection. If not set, the default idle timeout is 1 hour. There is a related [github issue](https://github.com/istio/istio/issues/24387) about that. Could you try changing it in istio-proxy with an annotation or an envoy filter, as mentioned [here](https://github.com/istio/istio/issues/24387#issuecomment-651303969)? Additionally, there is an [example](https://github.com/istio/istio/issues/25555#issuecomment-659051715) of it configured in the operator. – Jakub Sep 11 '20 at 15:06
  • Thanks for the suggestions. We tried all of them but it's not picking up this timeout setting :( – Yayotrón Sep 14 '20 at 10:41
  • Hi @Yayotrón, have you managed to make it work? Have you tried the annotation on istio-proxy? Is the result the same as [here](https://github.com/istio/istio/issues/24387#issuecomment-651859008)? As far as I can see in the above github issue, a few people have the same issue with the same Istio version. I would suggest reporting this on github, as it might be a bug. As far as I checked [here](https://github.com/istio/istio/issues/23727), the command is correct; not sure why it doesn't work. – Jakub Sep 24 '20 at 12:41
  • Hi @Jakub thanks for following up :) In the end we didn't make it work; we tried the annotation and a bunch of other approaches... We decided to implement a workaround to perform this migration without istio-proxy, which worked very well. – Yayotrón Sep 24 '20 at 16:18

2 Answers


If you use an Istio version higher than 1.7, you might try using an Envoy filter to make it work. There is an answer and example on github provided by @ryant1986:

We ran into the same problem on 1.7, but we noticed that the ISTIO_META_IDLE_TIMEOUT setting was only getting picked up on the OUTBOUND side of things, not the INBOUND. By adding an additional filter that applied to the INBOUND side of the request, we were able to successfully increase the timeout (we used 24 hours).

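apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: listener-timeout-tcp
  # Created in istio-system so the filter is not tied to one app namespace.
  namespace: istio-system
spec:
  configPatches:
  - applyTo: NETWORK_FILTER
    match:
      # Patch only the inbound side of each sidecar.
      context: SIDECAR_INBOUND
      listener:
        filterChain:
          filter:
            # Select the TCP proxy filter in the inbound filter chains.
            name: envoy.filters.network.tcp_proxy
    patch:
      # MERGE overlays idle_timeout onto the existing filter config.
      operation: MERGE
      value:
        name: envoy.filters.network.tcp_proxy
        typed_config:
          # v2 type URL, as posted for Istio 1.7; releases built on the Envoy v3 API
          # use type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          '@type': type.googleapis.com/envoy.config.filter.network.tcp_proxy.v2.TcpProxy
          idle_timeout: 24h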
We also created a similar filter to apply to the passthrough cluster (so that timeouts still apply to external traffic that we don't have service entries for), since the config wasn't being picked up there either.
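The passthrough filter itself is not shown in the quoted answer. A minimal sketch of what it might look like follows, assuming the passthrough traffic is handled by the sidecar's catch-all outbound listener named virtualOutbound. The resource name passthrough-timeout-tcp is hypothetical, and the listener name should be checked against your own proxy config (for example with istioctl proxy-config listeners) before relying on it.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: passthrough-timeout-tcp   # hypothetical name, not from the original answer
  namespace: istio-system
spec:
  configPatches:
  - applyTo: NETWORK_FILTER
    match:
      context: SIDECAR_OUTBOUND
      listener:
        # Assumption: the catch-all outbound listener that serves passthrough traffic.
        name: virtualOutbound
        filterChain:
          filter:
            name: envoy.filters.network.tcp_proxy
    patch:
      operation: MERGE
      value:
        name: envoy.filters.network.tcp_proxy
        typed_config:
          # Same v2 type URL as the inbound filter above; newer releases need the v3 type.
          '@type': type.googleapis.com/envoy.config.filter.network.tcp_proxy.v2.TcpProxy
          # 24 hours, written in seconds.
          idle_timeout: 86400s
```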

Jakub

For the ingress gateway, we use env.ISTIO_META_IDLE_TIMEOUT to set the idle timeout for TCP or HTTP. For the sidecar, you can use a similar EnvoyFilter (listener-timeout-tcp) to configure the INBOUND or OUTBOUND direction.
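As a rough illustration of the OUTBOUND variant, the sketch below applies the same patch in the SIDECAR_OUTBOUND context and scopes it to a single workload with workloadSelector, so only the pod running the migration gets the longer timeout. The resource name, the namespace my-app-namespace and the label app: flyway-migration are all hypothetical placeholders.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: listener-timeout-tcp-outbound   # hypothetical name
  namespace: my-app-namespace           # hypothetical: namespace of the migrating app
spec:
  # Limit the patch to pods carrying this (hypothetical) label.
  workloadSelector:
    labels:
      app: flyway-migration
  configPatches:
  - applyTo: NETWORK_FILTER
    match:
      # Outbound side: connections the app opens, e.g. towards PostgreSQL.
      context: SIDECAR_OUTBOUND
      listener:
        filterChain:
          filter:
            name: envoy.filters.network.tcp_proxy
    patch:
      operation: MERGE
      value:
        name: envoy.filters.network.tcp_proxy
        typed_config:
          # v2 type URL to match the filter quoted earlier; newer releases need the v3 type.
          '@type': type.googleapis.com/envoy.config.filter.network.tcp_proxy.v2.TcpProxy
          # 24 hours, written in seconds.
          idle_timeout: 86400s
```

Scoping the filter this way keeps the default one-hour idle timeout for every other workload in the mesh.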

HobbyTan