2023-04-13

Spring Boot ECS service cannot connect to DocumentDB cluster

In AWS I have a Spring Boot application running in ECS Fargate in a Docker container, which I deployed using CloudFormation. In the same VPC (with two subnets) I deployed a DocumentDB cluster with a single instance. When run locally with MongoDB (both in-memory and as a Docker image), the Spring Boot application connects fine (both in the IDE and using Docker Compose, respectively).

In CloudFormation I injected the DocumentDB cluster endpoint and port as Spring Boot environment variables in the ECS container definition:

- Name: SPRING_DATA_MONGODB_HOST
  Value: !GetAtt DbCluster.Endpoint
- Name: SPRING_DATA_MONGODB_PORT
  Value: !GetAtt DbCluster.Port
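For context, those variable names follow the canonical form Spring Boot documents for binding environment variables to properties: replace dots with underscores, remove dashes, and upper-case the result. A minimal sketch of that naming rule, using only the JDK (the class name EnvNames and the method toEnvVar are hypothetical, chosen for illustration, and are not part of Spring Boot):

```java
import java.util.Locale;

final class EnvNames {
    private EnvNames() {}

    // Canonical form described in the Spring Boot docs:
    // dots become underscores, dashes are removed, everything is upper-cased.
    static String toEnvVar(String propertyName) {
        return propertyName
                .replace('.', '_')
                .replace("-", "")
                .toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(toEnvVar("spring.data.mongodb.host")); // SPRING_DATA_MONGODB_HOST
        System.out.println(toEnvVar("spring.data.mongodb.port")); // SPRING_DATA_MONGODB_PORT
    }
}
```

So the two entries above should end up as spring.data.mongodb.host and spring.data.mongodb.port, which matches the host and port that appear in the startup log.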

When I deploy the CloudFormation stack, in the console I can go to the DocumentDB cluster and see that it shows the database cluster host as something like this:

foo-bar-xxxxxxxxxxxx.us-east-1.docdb.amazonaws.com

The password is stored in Secrets Manager and attached to the database; that secret shows the same host name.

I can go to a Cloud9 instance in the same VPC and connect to foo-bar-xxxxxxxxxxxx.us-east-1.docdb.amazonaws.com using mongosh.

When my Spring Boot instance starts up, its logs show:

2023-04-11T23:18:39.610Z INFO 1 --- [ main] org.mongodb.driver.client : MongoClient with metadata {"driver": {"name": "mongo-java-driver|sync|spring-boot", "version": "4.8.2"}, "os": {"type": "Linux", "name": "Linux", "architecture": "amd64", "version": "5.10.173-154.642.amzn2.x86_64"}, "platform": "Java/Eclipse Adoptium/17.0.6+10"} created with settings MongoClientSettings{… clusterSettings={hosts=[foo-bar-xxxxxxxxxxxx.us-east-1.docdb.amazonaws.com:27017], srvServiceName=mongodb, mode=SINGLE, …

Note that the database cluster host Spring Boot is using is the same one I connected to from Cloud9.

But eventually the Spring Boot application times out:

org.springframework.dao.DataAccessResourceFailureException: Timed out after 30000 ms while waiting to connect. Client view of cluster state is {type=UNKNOWN, servers=[{address=foo-bar-xxxxxxxxxxxx.us-east-1.docdb.amazonaws.com:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketReadTimeoutException: Timeout while receiving message}, caused by {java.net.SocketTimeoutException: Read timed out}}]

I even restarted the ECS task thinking maybe the database hadn't been initialized yet, but with the same result.

In Cloud9, an nslookup of foo-bar-xxxxxxxxxxxx.us-east-1.docdb.amazonaws.com resolves to an IP address within the VPC. The security group (AWS::EC2::SecurityGroup) associated with the ECS service has no egress rules defined, so it defaults to allowing all egress to anywhere on any port (which I just confirmed via the console).
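The exception in the log is a read timeout ("Read timed out") rather than a connect timeout, which may mean the TCP connection itself is being established and the failure happens afterwards. One way to check raw reachability separately from the MongoDB driver would be a plain TCP probe run from inside the task. A minimal sketch, assuming only the JDK (the class name TcpProbe and the 5-second timeout are illustrative choices, not anything taken from the deployment):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class TcpProbe {
    private TcpProbe() {}

    // Returns true if a TCP connection to host:port is established within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, connect timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java TcpProbe.java <host> <port>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        boolean ok = canConnect(host, port, 5000); // 5 s timeout, an arbitrary choice
        System.out.println(host + ":" + port + (ok ? " reachable" : " NOT reachable"));
    }
}
```

It could be run as `java TcpProbe.java foo-bar-xxxxxxxxxxxx.us-east-1.docdb.amazonaws.com 27017`. A "reachable" result only shows that a TCP handshake completes; it says nothing about what happens at the protocol level after the connection is open.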

If Spring Boot, DocumentDB, and Cloud9 are all running in the same VPC, any idea why Cloud9 can connect just fine to foo-bar-xxxxxxxxxxxx.us-east-1.docdb.amazonaws.com, but the Spring Boot ECS Fargate task cannot? Where should I be looking?

I should also mention I'm using Service Connect, where I've set up a Cloud Map namespace example.internal and enabled Service Connect for the ECS service with a port name of my-service. I understand that Service Connect sets up some sort of sidecar "proxy" container running alongside my task container. Could this Service Connect proxy somehow be blocking my service's outgoing requests? Do I need to do something further to allow the service to make connections to the database cluster?


