ECS container HTTP reset peer

ECS container HTTP reset peer injects an HTTP reset into the service whose port is specified by the TARGET_SERVICE_PORT environment variable.

It stops outgoing HTTP requests by resetting their TCP connections.
It helps determine the application's resilience to a lossy (or flaky) HTTP connection.
This experiment induces chaos within a container and depends on an EC2 instance. Faults of this kind are typically prefixed with "ECS container" and interact directly with the EC2 instances that host the ECS containers.

Use cases

ECS container HTTP reset peer:

Simulates premature connection loss between microservices, caused by firewall or other issues, so that you can verify connection timeout handling.
Simulates connection resets caused by server-side resource limitations, such as an out-of-memory server, a killed process, or overload from heavy traffic.

Prerequisites

Kubernetes >= 1.17
ECS container metadata is enabled (disabled by default). To enable it, go to container metadata. If your task is currently running, you may need to restart it to get the metadata directory.
A running ECS cluster with the desired tasks and containers, and familiarity with ECS service update and deployment concepts.
Access to the ECS cluster instances with the necessary permissions to update the start and stop timeouts for containers. Refer to systems manager docs.
Backup and recovery mechanisms in place to handle potential failures during the testing process.
You and the ECS cluster instances have a role with the required AWS access to perform the SSM and ECS operations.
A Kubernetes secret in the CHAOS_NAMESPACE that contains the AWS access key ID and secret access key. Below is a sample secret file:

apiVersion: v1
kind: Secret
metadata:
  name: cloud-secret
type: Opaque
stringData:
  cloud_config.yml: |-
    # Add the cloud AWS credentials respectively
    [default]
    aws_access_key_id = XXXXXXXXXXXXXXXXXXX
    aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

tip

HCE recommends that you use the same secret name, that is, cloud-secret. Otherwise, you will need to update the AWS_SHARED_CREDENTIALS_FILE environment variable in the fault template with the new secret name and you won't be able to use the default health check probes.

Below is an example AWS policy to execute the fault.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ssm:GetDocument",
                "ssm:DescribeDocument",
                "ssm:GetParameter",
                "ssm:GetParameters",
                "ssm:SendCommand",
                "ssm:CancelCommand",
                "ssm:CreateDocument",
                "ssm:DeleteDocument",
                "ssm:GetCommandInvocation",
                "ssm:UpdateInstanceInformation",
                "ssm:DescribeInstanceInformation"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2messages:AcknowledgeMessage",
                "ec2messages:DeleteMessage",
                "ec2messages:FailMessage",
                "ec2messages:GetEndpoint",
                "ec2messages:GetMessages",
                "ec2messages:SendReply"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstanceStatus",
                "ec2:DescribeInstances"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}

note

You can pass the VM credentials as secrets or as a ChaosEngine environment variable.
The ECS container should be in a healthy state before and after introducing chaos.
Refer to the superset permission or policy to execute all AWS faults.
Refer to the common attributes to tune the common tunables for all the faults.
Refer to AWS named profile for chaos to use a different profile for AWS faults.

Mandatory tunables

Tunable	Description	Notes
REGION	The AWS region ID where the ECS container has been created.	For example, `us-east-1`.
RESET_TIMEOUT	Duration after which the connection is reset (in ms).	Default: 0. For more information, go to reset timeout.
TARGET_SERVICE_PORT	Port of the service to target.	Default: 80. For more information, go to target service port.
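
As an illustration, the mandatory tunables can be set together in a single ChaosEngine. This is a minimal sketch that reuses the layout of the other snippets on this page; the region and timeout values are placeholders, not recommendations.

```yaml
# illustrative values only; adjust them to your environment
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ecs-container-http-reset-peer
    spec:
      components:
        env:
        # AWS region where the ECS container was created
        - name: REGION
          value: "us-east-1"
        # duration (in ms) after which the connection is reset
        - name: RESET_TIMEOUT
          value: "2000"
        # port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"
```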

Optional tunables

Tunable	Description	Notes
TOTAL_CHAOS_DURATION	Duration for which chaos is injected into the target resource (in seconds).	Default: 30 s. For more information, go to duration of the chaos.
CHAOS_INTERVAL	Time interval between two successive chaos iterations (in seconds).	Default: 30 s. For more information, go to chaos interval.
AWS_SHARED_CREDENTIALS_FILE	Path to the AWS secret credentials file.	Default: `/tmp/cloud_config.yml`.
SEQUENCE	Sequence of chaos execution for multiple instances.	Default: parallel. Supports serial and parallel. For more information, go to sequence of chaos execution.
RAMP_TIME	Period to wait before and after injection of chaos (in seconds).	For example, 30 s. For more information, go to ramp time.
INSTALL_DEPENDENCY	Whether to install the dependencies used to run the network chaos. Set to True or False.	Default: True. Set it to False if the dependencies are already installed.
PROXY_PORT	Port on which the proxy listens for requests.	Default: 20000. For more information, go to proxy port.
NETWORK_INTERFACE	Network interface to be used for the proxy.	Default: `eth0`. For more information, go to network interface.
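
The timing-related optional tunables are usually adjusted together. The following is a sketch only, assuming the same ChaosEngine layout as the other snippets on this page; the durations are illustrative, and the target port is included only because the fault needs it.

```yaml
# illustrative values only; all durations are in seconds
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ecs-container-http-reset-peer
    spec:
      components:
        env:
        # total time for which chaos is injected
        - name: TOTAL_CHAOS_DURATION
          value: "60"
        # interval between successive chaos iterations
        - name: CHAOS_INTERVAL
          value: "30"
        # wait period before and after chaos injection
        - name: RAMP_TIME
          value: "30"
        # serial or parallel execution across multiple instances
        - name: SEQUENCE
          value: "parallel"
        # port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"
```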

Target service port

Service port that is targeted. Tune it by using the TARGET_SERVICE_PORT environment variable.

The following YAML snippet illustrates the use of this environment variable:

## provide the port of the targeted service
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ecs-container-http-reset-peer
    spec:
      components:
        env:
        # provide the port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"

Proxy port

Port where the proxy server listens for requests. Tune it by using the PROXY_PORT environment variable.

The following YAML snippet illustrates the use of this environment variable:

# provide the port for proxy server
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ecs-container-http-reset-peer
    spec:
      components:
        env:
        # provide the port for proxy server
        - name: PROXY_PORT
          value: "8080"
        # provide the port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"

Reset timeout

Duration, in milliseconds, after which the connection for an HTTP request is reset. Tune it by using the RESET_TIMEOUT environment variable.

The following YAML snippet illustrates the use of this environment variable:

## provide the reset timeout value
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ecs-container-http-reset-peer
    spec:
      components:
        env:
        # duration (in ms) after which the connection is reset
        - name: RESET_TIMEOUT
          value: "2000"
        # provide the port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"

Network interface

Network interface used for the proxy. Tune it by using the NETWORK_INTERFACE environment variable.

The following YAML snippet illustrates the use of this environment variable:

## provide the network interface for proxy
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ecs-container-http-reset-peer
    spec:
      components:
        env:
        # provide the network interface for proxy
        - name: NETWORK_INTERFACE
          value: "eth0"
        # provide the port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"
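
The proxy-related tunables can also be combined in one ChaosEngine. This is a hedged sketch, assuming the dependencies are already present on the instance, so INSTALL_DEPENDENCY is turned off; the port and interface shown are the documented defaults.

```yaml
# illustrative combination of the proxy-related tunables
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ecs-container-http-reset-peer
    spec:
      components:
        env:
        # port on which the proxy listens for requests
        - name: PROXY_PORT
          value: "20000"
        # network interface used for the proxy
        - name: NETWORK_INTERFACE
          value: "eth0"
        # skip dependency installation (assumes it is already installed)
        - name: INSTALL_DEPENDENCY
          value: "False"
        # port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"
```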
