MQTT messages no longer received #118

Open
EnvironCMO opened this issue Jul 31, 2024 · 4 comments

Comments

@EnvironCMO

We are currently using the aws-ocpp-gateway architecture to connect charging wallboxes to AWS, deployed using GitHub Actions and AWS CDK.

The connection was successful, and we were receiving messages in IoT Core until last week. After a deployment that does not affect the OCPP gateway stack, we are no longer receiving messages in IoT Core. Redeploying all stacks was the only fix, but it was temporary: on the next deployment, we stopped receiving messages again.

We have run manual tests on Postman and we can connect to the websocket endpoint but are consistently disconnected after 10 seconds. In the ECS task logs, we also see the reported timeout.

@galinsky
Contributor

If I understand correctly, you have reproduced the same behavior twice: 1/ a full stack deployment, after which everything runs fine for an extended period, and 2/ another deployment, after which the gateway starts failing with a websocket timeout. Can you provide some details of the "deployment that does not affect the OCPP gateway stack"? There may be some connection between that release and the websocket disconnect.

@EnvironCMO
Author

The OCPP gateway stack only depends on the baseline stack. The baseline stack has a caching table in it, which is not currently used. The OCPP stack has custom resources that are rebuilt on each deploy, but the only thing they retrieve is a static endpoint (i.e. it is the same for every deployment, and we have checked that it really is the same).

By the way, it was not a full-stack deployment. The sequence was:

  1. Deployment of just the OCPP stack and the baseline stack in a separate account: we receive messages.
  2. Deletion and redeployment of the OCPP stack only: fails, no messages.
  3. Redeployment of all stacks, including the baseline stack: we receive messages.
  4. Next deployment, with no change to the OCPP or baseline stack: fails, we don’t receive messages.

@galinsky
Contributor

The custom resources involved are: 1/ the IoT endpoint lookup, which is purely a read function and so should have no impact, 2/ generation of the IoT certs used by the gateway to connect to AWS IoT Core (the relevant section of code is below), and 3/ attaching policies to the cert. I suspect that the 2nd custom resource is causing the issue. Although it does not have an explicit onUpdate action, it may be deactivating the cert, which would prevent the gateway from connecting to IoT Core. You may want to add more logging (output messages) to this stack to see whether that is in fact the issue.

    // Custom resource that creates the keys and certificate the gateway uses for AWS IoT Core.
    const iotCreateKeysAndCertificateCr = new cr.AwsCustomResource(this, 'KeysCerts', {
      policy: cr.AwsCustomResourcePolicy.fromStatements([
        new iam.PolicyStatement({
          effect: iam.Effect.ALLOW,
          resources: cr.AwsCustomResourcePolicy.ANY_RESOURCE,
          actions: ['iot:CreateKeysAndCertificate', 'iot:UpdateCertificate'],
        }),
      ]),
      logRetention: logs.RetentionDays.ONE_DAY,
      // On create: generate an active certificate; its ID becomes the physical resource ID.
      onCreate: {
        service: 'Iot',
        action: 'createKeysAndCertificate',
        parameters: {
          setAsActive: true,
        },
        physicalResourceId: cr.PhysicalResourceId.fromResponse('certificateId'),
      },
      // No onUpdate handler is defined.
      // On delete: mark the certificate identified by the physical resource ID as INACTIVE.
      onDelete: {
        service: 'Iot',
        action: 'updateCertificate',
        parameters: {
          certificateId: new cr.PhysicalResourceIdReference(),
          newStatus: 'INACTIVE',
        },
      },
    });
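
One way to add the logging suggested above is to record the certificate status on each deploy or gateway start. The following is only a minimal sketch: `CertificateInfo`, `isCertificateUsable`, and `certificateLogLine` are hypothetical names, not part of the gateway or the CDK, and the status value is assumed to come from an IoT certificate lookup (for example `DescribeCertificateCommand` in `@aws-sdk/client-iot`), which is left out so the sketch has no dependencies.

```typescript
// Fields read from an IoT certificate description. With the AWS SDK for
// JavaScript v3 these could be filled from the `certificateDescription`
// returned by `DescribeCertificateCommand`; that wiring is an assumption
// and is deliberately not shown here.
interface CertificateInfo {
  certificateId: string;
  status: string; // for example 'ACTIVE' or 'INACTIVE'
}

// Hypothetical helper: true only when the certificate is in the ACTIVE state.
function isCertificateUsable(cert: CertificateInfo): boolean {
  return cert.status === 'ACTIVE';
}

// Hypothetical helper: builds one log line per check, so a deactivated
// certificate stands out in the ECS task or deployment logs.
function certificateLogLine(cert: CertificateInfo): string {
  return isCertificateUsable(cert)
    ? `IoT certificate ${cert.certificateId}: ACTIVE`
    : `IoT certificate ${cert.certificateId}: ${cert.status} (not usable for connecting to IoT Core)`;
}

// Example with a made-up certificate ID.
console.log(certificateLogLine({ certificateId: 'example-cert-id', status: 'INACTIVE' }));
```

If the line reports anything other than ACTIVE right after a failing deploy, that would support the suspicion that the certificate is being deactivated.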

@EnvironCMO
Author

Thanks for your reply. Yes, we do see that on every deploy, regardless of whether there have been changes to the stack itself, an update is made to the custom resources:

OcppGatewayStack: creating CloudFormation changeset...
OcppGatewayStack | 0/9 | 9:26:14 AM | UPDATE_IN_PROGRESS   | Custom::AWS                                 | IOTDescribeEndpoint/Resource/Default (IOTDescribeEndpoint77221C07) 
OcppGatewayStack | 0/9 | 9:26:14 AM | UPDATE_IN_PROGRESS   | Custom::AWS                                 | UpdateEventConfigurations/Resource/Default (UpdateEventConfigurationsABD0680D) 
OcppGatewayStack | 0/9 | 9:26:14 AM | UPDATE_IN_PROGRESS   | Custom::AWS                                 | AttachPolicyIOT/Resource/Default (AttachPolicyIOT552A8C9A) 
OcppGatewayStack | 0/9 | 9:26:08 AM | UPDATE_IN_PROGRESS   | AWS::CloudFormation::Stack                  | OcppGatewayStack User Initiated
OcppGatewayStack | 1/9 | 9:26:17 AM | UPDATE_COMPLETE      | Custom::AWS                                 | UpdateEventConfigurations/Resource/Default (UpdateEventConfigurationsABD0680D) 
OcppGatewayStack | 2/9 | 9:26:17 AM | UPDATE_COMPLETE      | Custom::AWS                                 | AttachPolicyIOT/Resource/Default (AttachPolicyIOT552A8C9A) 
OcppGatewayStack | 2/9 | 9:26:21 AM | UPDATE_IN_PROGRESS   | Custom::AWS                                 | IOTDescribeEndpoint/Resource/Default (IOTDescribeEndpoint77221C07) Requested update required the provider to create a new physical resource
OcppGatewayStack | 3/9 | 9:26:22 AM | UPDATE_COMPLETE      | Custom::AWS                                 | IOTDescribeEndpoint/Resource/Default (IOTDescribeEndpoint77221C07) 
OcppGatewayStack | 3/9 | 9:26:25 AM | UPDATE_IN_PROGRESS   | AWS::ECS::TaskDefinition                    | Task (Task79114B6B) Requested update requires the creation of a new physical resource; hence creating one.
OcppGatewayStack | 3/9 | 9:26:27 AM | UPDATE_IN_PROGRESS   | AWS::ECS::TaskDefinition                    | Task (Task79114B6B) Resource creation Initiated
OcppGatewayStack | 4/9 | 9:26:27 AM | UPDATE_COMPLETE      | AWS::ECS::TaskDefinition                    | Task (Task79114B6B) 
OcppGatewayStack | 4/9 | 9:26:29 AM | UPDATE_IN_PROGRESS   | AWS::ECS::Service                           | Service/Service (ServiceD69D759B) 
4/9 Currently in progress: OcppGatewayStack, ServiceD69D759B
OcppGatewayStack | 3/9 | 9:32:26 AM | DELETE_COMPLETE      | AWS::ECS::TaskDefinition                    | Task (Task79114B6B) 
OcppGatewayStack | 3/9 | 9:32:26 AM | DELETE_IN_PROGRESS   | AWS::CloudFormation::CustomResource         | IOTDescribeEndpoint/Resource/Default (IOTDescribeEndpoint77221C07) 
OcppGatewayStack | 4/9 | 9:32:21 AM | UPDATE_COMPLETE      | AWS::ECS::Service                           | Service/Service (ServiceD69D759B) 
OcppGatewayStack | 5/9 | 9:32:23 AM | UPDATE_COMPLETE_CLEA | AWS::CloudFormation::Stack                  | OcppGatewayStack 
OcppGatewayStack | 5/9 | 9:32:25 AM | DELETE_IN_PROGRESS   | AWS::ECS::TaskDefinition                    | Task (Task79114B6B) 
OcppGatewayStack | 4/9 | 9:32:28 AM | DELETE_COMPLETE      | AWS::CloudFormation::CustomResource         | IOTDescribeEndpoint/Resource/Default (IOTDescribeEndpoint77221C07) 
OcppGatewayStack | 5/9 | 9:32:28 AM | UPDATE_COMPLETE      | AWS::CloudFormation::Stack                  | OcppGatewayStack 

If you have any suggestions on fixes or logging that could help further, please let us know.
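
The event log above can also be scanned mechanically for resources that received a new physical resource during a deploy. This is a minimal sketch, assuming the log keeps the pipe-separated layout pasted above; `StackEvent`, `parseEventLine`, and `replacedResources` are hypothetical helper names, not part of the CDK or the gateway.

```typescript
// One parsed resource-level event from the deploy output.
interface StackEvent {
  stack: string;
  time: string;
  status: string;
  resourceType: string;
  resource: string; // construct path, e.g. "IOTDescribeEndpoint/Resource/Default"
  logicalId: string;
  reason: string; // free text after the logical ID; empty when absent
}

// Parses one line of the form
//   Stack | n/m | time | STATUS | Type | Path (LogicalId) optional reason
// Returns undefined for progress lines and stack-level events without "(LogicalId)".
function parseEventLine(line: string): StackEvent | undefined {
  const parts = line.split('|').map((p) => p.trim());
  if (parts.length !== 6) return undefined;
  const [stack = '', , time = '', status = '', resourceType = '', detail = ''] = parts;
  const match = /^(\S+)\s+\((\w+)\)\s*(.*)$/.exec(detail);
  if (!match) return undefined;
  const [, resource = '', logicalId = '', reason = ''] = match;
  return { stack, time, status, resourceType, resource, logicalId, reason };
}

// Collects the logical IDs of resources whose events mention a new physical
// resource, in order of first appearance.
function replacedResources(log: string): string[] {
  const ids = new Set<string>();
  for (const line of log.split('\n')) {
    const event = parseEventLine(line);
    if (event && event.reason.includes('new physical resource')) {
      ids.add(event.logicalId);
    }
  }
  return Array.from(ids);
}
```

Run against the log above, this flags `IOTDescribeEndpoint77221C07` and `Task79114B6B`, the two resources whose events mention a new physical resource.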
