AWS CloudWatch + AWS Lambda
Automate Cloud Operations by Integrating AWS CloudWatch with AWS Lambda
Turn real-time monitoring alerts into automated actions without writing custom infrastructure code.


Why integrate AWS CloudWatch and AWS Lambda?
AWS CloudWatch and AWS Lambda pair naturally for event-driven cloud operations. CloudWatch continuously monitors metrics, logs, and alarms across your AWS infrastructure, while Lambda executes serverless functions in response to virtually any trigger. By integrating these two services through tray.ai, teams can build automated workflows that detect anomalies, respond to threshold breaches, and coordinate cross-service remediation, all without manual intervention.
Automate & integrate AWS CloudWatch & AWS Lambda
Use case
Automated Infrastructure Remediation on CloudWatch Alarms
When a CloudWatch alarm detects a critical threshold breach — CPU utilization exceeding 90%, memory running low — tray.ai can automatically trigger a Lambda function to fix the issue. That might mean restarting unhealthy EC2 instances, scaling out an Auto Scaling group, or flushing a clogged queue. The entire remediation loop runs in seconds without anyone getting paged.
Use case
Real-Time Log Anomaly Detection and Alerting
CloudWatch Logs metric filters and Logs Insights queries can surface anomalous patterns such as repeated 5xx error codes, authentication failures, or unexpected null responses. tray.ai can route those findings to a Lambda function that enriches the event with additional context before notifications go out, so engineering, security, and operations teams each receive a contextualized, actionable alert rather than a raw log dump.
Use case
Scheduled Lambda Invocations Triggered by CloudWatch Events
CloudWatch Events (now Amazon EventBridge) supports cron-style scheduling that can invoke Lambda functions at precise intervals for recurring operational tasks. Using tray.ai, teams can manage and extend these scheduled workflows — running nightly database cleanup functions, generating periodic cost reports, or invoking data transformation pipelines. tray.ai adds orchestration logic on top of native scheduling, so you can include conditional branching and downstream notifications.
Use case
Cross-Service Incident Escalation Workflows
When CloudWatch detects a service degradation event, tray.ai can trigger a Lambda function that creates a PagerDuty incident, posts a message to a designated Slack channel, and opens a ticket in Jira, all from a single alarm. The right stakeholders are notified through their preferred channels, and conditional logic can escalate differently based on alarm severity.
Use case
Serverless Cost Monitoring and Budget Enforcement
CloudWatch metrics expose detailed Lambda invocation counts, execution durations, throttle rates, and error rates that, when analyzed together, reveal cost anomalies and runaway functions. tray.ai can build workflows that monitor these metrics against budget thresholds and automatically trigger a Lambda function to disable or throttle a specific function if it exceeds its allocated cost envelope. Finance and engineering teams get automated reports when budget guardrails kick in.
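As a rough illustration of how invocation counts and durations translate into a cost figure, the sketch below applies the general Lambda billing model (a per-request charge plus a charge per GB-second of compute). The function names are hypothetical, and both prices are caller-supplied assumptions because actual rates vary by region and architecture; it also ignores free-tier allowances and other charges.

```python
def estimate_lambda_cost(invocations, avg_duration_ms, memory_mb,
                         price_per_million_requests, price_per_gb_second):
    """Estimate cost from CloudWatch-style totals; prices are supplied by the caller."""
    request_cost = invocations / 1_000_000 * price_per_million_requests
    # GB-seconds = invocations * duration in seconds * configured memory in GB
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return request_cost + gb_seconds * price_per_gb_second


def over_budget(estimated_cost, budget):
    """Return True when the estimate exceeds the allocated cost envelope."""
    return estimated_cost > budget
```

A workflow could feed the summed Invocations metric and the average Duration metric for a function into `estimate_lambda_cost` and branch on `over_budget` before deciding whether to throttle.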
Use case
Automated Log Archiving and Compliance Reporting
CloudWatch Log Groups accumulate large volumes of operational and application logs that must be retained for compliance purposes. tray.ai can schedule a Lambda function to export specific log groups to S3 on a defined cadence, apply lifecycle policies, and trigger downstream notifications to compliance tools. This replaces fragile custom scripts with a managed, observable automation that works with your existing data retention policies.
Use case
Dynamic Auto-Scaling Triggered by Custom CloudWatch Metrics
Beyond native AWS auto-scaling policies, teams often need custom scaling logic based on application-specific metrics published to CloudWatch, such as queue depth, active user sessions, or business transaction volume. tray.ai can monitor these custom metrics and invoke Lambda functions that execute complex scaling decisions, modify capacity reservations, or interact with third-party infrastructure tools. This supports scaling strategies that native policies cannot express on their own.
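To make the idea of a custom scaling decision concrete, here is a minimal sketch of one possible rule: size a worker pool from queue depth, clamped to a minimum and maximum. The function name, parameters, and the rule itself are illustrative assumptions, not something prescribed by tray.ai or AWS.

```python
import math


def desired_capacity(queue_depth, messages_per_worker, min_workers, max_workers):
    """Return a worker count for the current queue depth, within fixed bounds."""
    if messages_per_worker <= 0:
        raise ValueError("messages_per_worker must be positive")
    needed = math.ceil(queue_depth / messages_per_worker)
    # Clamp so a metric spike or an empty queue never leaves the allowed range.
    return max(min_workers, min(max_workers, needed))
```

A Lambda function could compute this value from a custom CloudWatch metric and then apply it to whatever capacity setting the team manages.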
AWS CloudWatch & AWS Lambda Challenges
What challenges arise when working with AWS CloudWatch & AWS Lambda, and how does Tray.ai help?
Challenge
Managing Event Payload Complexity Between CloudWatch and Lambda
CloudWatch alarm events, log subscription filter events, and scheduled events all have distinct JSON payload structures that need careful mapping before they can be used as Lambda function inputs. Teams often spend significant time writing and maintaining transformation logic to normalize these payloads across different event sources.
How Tray.ai Can Help:
tray.ai's visual data mapping interface lets teams inspect, transform, and normalize CloudWatch event payloads into the exact structure a Lambda function expects — no custom transformation code required. When schemas change, updates can be made visually and take effect immediately across all affected workflows.
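For teams that do handle this mapping in code, the sketch below shows roughly what the normalization involves. It assumes three commonly seen shapes: a log subscription filter payload (base64-encoded, gzip-compressed JSON under `awslogs.data`), an EventBridge "CloudWatch Alarm State Change" event, and an EventBridge "Scheduled Event". The output field names (`kind`, `source`, `messages`) are hypothetical choices made for this example.

```python
import base64
import gzip
import json


def normalize_event(event):
    """Map three CloudWatch-related event shapes onto one flat dict."""
    if "awslogs" in event:
        # Log subscription filter: decode base64, then gunzip, then parse JSON.
        raw = gzip.decompress(base64.b64decode(event["awslogs"]["data"]))
        data = json.loads(raw)
        return {
            "kind": "log",
            "source": data["logGroup"],
            "messages": [e["message"] for e in data["logEvents"]],
        }
    detail_type = event.get("detail-type")
    if detail_type == "CloudWatch Alarm State Change":
        detail = event["detail"]
        return {
            "kind": "alarm",
            "source": detail["alarmName"],
            "messages": [detail["state"]["reason"]],
            "state": detail["state"]["value"],
        }
    if detail_type == "Scheduled Event":
        return {
            "kind": "schedule",
            "source": event["resources"][0],  # ARN of the rule that fired
            "messages": [],
            "time": event["time"],
        }
    raise ValueError("unrecognized event shape")
```

Raising on unknown shapes, rather than guessing, keeps a new event source from being silently mishandled.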
Challenge
Handling Lambda Execution Timeouts and Retry Logic
Lambda functions invoked by CloudWatch events may time out, throw errors, or need retry logic with exponential backoff — especially during infrastructure incidents when dependent services are themselves degraded. Without proper retry handling, critical remediation functions can silently fail at exactly the wrong moment.
How Tray.ai Can Help:
tray.ai provides built-in error handling, retry configuration, and dead-letter queue routing at the workflow level, so Lambda invocation failures are captured, retried on a configurable schedule, and escalated to the appropriate team if they exceed the maximum retry count.
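For reference, retry with exponential backoff can be sketched in a few lines. This is a generic illustration, not tray.ai's implementation; the function name and defaults are assumptions, and `call` stands for any zero-argument callable that performs the invocation.

```python
import time


def invoke_with_backoff(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `call()` until it succeeds, doubling the delay after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:  # in practice, catch only errors known to be transient
            if attempt == max_attempts - 1:
                raise  # retries exhausted: let the caller escalate
            sleep(base_delay * 2 ** attempt)
```

With the defaults, the waits between attempts are 1, 2, and 4 seconds. Passing `sleep` as a parameter keeps the function testable without real delays.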
Challenge
Cross-Account and Cross-Region CloudWatch Event Routing
Large enterprises often operate multiple AWS accounts and regions, making it difficult to build centralized workflows that respond to CloudWatch alarms from different organizational units without complex cross-account IAM configurations and event bus routing rules.
How Tray.ai Can Help:
tray.ai can connect to multiple AWS accounts and regions simultaneously using distinct credential sets, so you can run centralized workflow orchestration that spans CloudWatch sources and Lambda execution targets across your entire AWS organization without custom cross-account plumbing.
Challenge
Avoiding Runaway Recursive Lambda Invocations
A poorly configured workflow can create recursive loops where a Lambda function writes to a CloudWatch log, which triggers a metric filter alarm, which invokes the same Lambda function again. Left unchecked, the loop can drive up AWS charges rapidly and cause service disruptions within minutes.
How Tray.ai Can Help:
tray.ai's workflow engine includes loop detection, execution rate limiting, and configurable cooldown periods between alarm-triggered invocations, giving teams guardrails that catch recursive execution patterns before they turn into costly incidents.
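The cooldown idea can be illustrated with a small guard that allows at most one invocation per alarm within a time window. This is a generic sketch with a hypothetical class name, not tray.ai's engine. It keeps state in memory, so a real deployment would need a shared store for that state, since separate Lambda execution environments do not share memory.

```python
import time


class CooldownGuard:
    """Allow at most one invocation per alarm within `cooldown_seconds`."""

    def __init__(self, cooldown_seconds, clock=time.monotonic):
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self._last_allowed = {}  # alarm name -> time of last allowed invocation

    def allow(self, alarm_name):
        now = self.clock()
        last = self._last_allowed.get(alarm_name)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # still cooling down: suppress this invocation
        self._last_allowed[alarm_name] = now
        return True
```

A suppressed call does not reset the window, so a steady stream of repeat alarms still lets one invocation through per cooldown period.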
Challenge
Maintaining Observability Over Automated Remediation Actions
When Lambda functions are automatically invoked by CloudWatch alarms to perform remediation, it can be hard to maintain a clear audit trail of exactly what actions were taken, when, and against which resources — which makes post-incident reviews and compliance audits painful.
How Tray.ai Can Help:
tray.ai logs every workflow execution with full input/output data, timestamps, and execution status. That gives you a complete, queryable audit trail of every CloudWatch-triggered Lambda invocation, the remediation actions taken, and the outcomes returned — independent of AWS CloudTrail.
Start using our pre-built AWS CloudWatch & AWS Lambda templates today
Start from scratch or use one of our pre-built AWS CloudWatch & AWS Lambda templates to quickly solve your most common use cases.
AWS CloudWatch & AWS Lambda Templates
Find pre-built AWS CloudWatch & AWS Lambda solutions for common use cases
Template
CloudWatch Alarm → Lambda Remediation → Slack Notification
This template monitors a CloudWatch alarm for a defined threshold breach, invokes a Lambda function to run automated remediation steps, and sends a detailed Slack notification confirming what happened — including the alarm name, breach value, and remediation outcome.
Steps:
- Trigger workflow when a CloudWatch alarm transitions to ALARM state
- Invoke the designated Lambda remediation function with alarm context as payload
- Parse the Lambda execution response and post a structured summary to a Slack channel
Connectors Used: AWS CloudWatch, AWS Lambda
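The three steps of this template can be sketched as follows. The sketch assumes the event uses the EventBridge alarm state change shape (`detail.alarmName`, `detail.state.value`, `detail.state.reason`), that `lambda_client` behaves like a boto3 Lambda client, and that the remediation function returns JSON. The function name `handle_alarm` and the message layout are hypothetical.

```python
import json


def handle_alarm(event, lambda_client, function_name):
    """Invoke a remediation function for an ALARM transition and build a Slack summary."""
    detail = event["detail"]
    if detail["state"]["value"] != "ALARM":
        return None  # ignore OK and INSUFFICIENT_DATA transitions

    context = {
        "alarmName": detail["alarmName"],
        "reason": detail["state"]["reason"],
    }
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the function's result
        Payload=json.dumps(context).encode("utf-8"),
    )
    outcome = json.loads(response["Payload"].read())
    # The response carries a FunctionError key only when the function failed.
    status = "FAILED" if "FunctionError" in response else "succeeded"
    return {
        "text": f"Alarm {context['alarmName']}: remediation {status}\n"
                f"Reason: {context['reason']}\n"
                f"Outcome: {json.dumps(outcome)}"
    }
```

The returned dict has the single `text` field that a simple Slack message post expects; sending it is left to whichever Slack integration the workflow uses.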
Template
Scheduled CloudWatch Metrics Report via Lambda to Email
This template uses a CloudWatch Events schedule to invoke a Lambda function at defined intervals, retrieve infrastructure metrics, and generate a formatted summary report delivered via email to engineering and operations stakeholders.
Steps:
- Trigger workflow on a CloudWatch Events cron schedule (e.g., daily at 8 AM UTC)
- Invoke Lambda function to query CloudWatch metrics and aggregate data into a report payload
- Format the report and send it via email or push it to a Confluence page for team visibility
Connectors Used: AWS CloudWatch, AWS Lambda
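In EventBridge schedule syntax, "daily at 8 AM UTC" is written `cron(0 8 * * ? *)`. The metric query step might look like the sketch below, which assumes `cloudwatch` behaves like a boto3 CloudWatch client and, purely as an example, summarizes EC2 CPU utilization over the previous 24 hours. The function name and report fields are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def daily_cpu_summary(cloudwatch, instance_id, now=None):
    """Summarize hourly CPU datapoints for one instance over the last 24 hours."""
    now = now or datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=now - timedelta(days=1),
        EndTime=now,
        Period=3600,  # one datapoint per hour
        Statistics=["Average", "Maximum"],
    )
    points = response["Datapoints"]
    if not points:
        return {"instance": instance_id, "average": None, "peak": None}
    return {
        "instance": instance_id,
        "average": sum(p["Average"] for p in points) / len(points),
        "peak": max(p["Maximum"] for p in points),
    }
```

The resulting dict is small enough to pass straight into an email or page template in the formatting step.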
Template
CloudWatch Log Anomaly → Lambda Enrichment → PagerDuty Incident
This template watches for CloudWatch Logs metric filter matches indicating an anomaly, triggers a Lambda function to enrich the event with additional AWS resource context, and automatically creates a PagerDuty incident with full details pre-populated.
Steps:
- Detect CloudWatch Logs metric filter alarm triggered by a specific log pattern
- Invoke Lambda function to retrieve additional context from AWS APIs (e.g., resource tags, recent deployments)
- Create a PagerDuty incident with enriched details including affected service, severity, and remediation runbook link
Connectors Used: AWS CloudWatch, AWS Lambda
Template
Lambda Error Rate Spike → CloudWatch Alarm → Jira Incident Ticket
This template monitors the CloudWatch Lambda error rate metric and, when a spike crosses a configurable threshold, automatically creates a Jira incident ticket with the function name, error count, time window, and a link to the relevant CloudWatch log group.
Steps:
- CloudWatch alarm triggers when Lambda error rate exceeds the defined threshold percentage
- tray.ai retrieves detailed Lambda execution metrics and recent error log samples from CloudWatch Logs
- Create a Jira issue in the incident tracking project with pre-populated fields and log context attached
Connectors Used: AWS CloudWatch, AWS Lambda
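The "error rate" in this template is a derived value: the Errors metric divided by the Invocations metric over the same window. A minimal sketch of that arithmetic, with hypothetical function names, is:

```python
def error_rate_percent(errors, invocations):
    """Return errors as a percentage of invocations; 0.0 when nothing ran."""
    if invocations == 0:
        return 0.0  # avoid dividing by zero for idle functions
    return 100.0 * errors / invocations


def breaches_threshold(errors, invocations, threshold_percent):
    """Return True when the error rate is strictly above the threshold."""
    return error_rate_percent(errors, invocations) > threshold_percent
```

For example, 12 errors across 400 invocations is a 3% error rate, which breaches a 2% threshold but not a 5% one.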
Template
CloudWatch Budget Alarm → Lambda Throttle Function → Finance Notification
This template watches for CloudWatch billing alarms that signal Lambda cost overruns, invokes a Lambda function to apply concurrency throttling to the offending function, and notifies finance and engineering teams with a cost summary report.
Steps:
- CloudWatch billing alarm triggers when Lambda invocation costs exceed the budget threshold
- Invoke Lambda management function to reduce reserved concurrency on the identified over-spending function
- Send cost summary and throttle action confirmation to finance team via email and engineering team via Slack
Connectors Used: AWS CloudWatch, AWS Lambda
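The throttling step could be sketched as below, assuming `lambda_client` behaves like a boto3 Lambda client. The function name `throttle_function` is hypothetical. Note that a reserved concurrency of 0 stops the target function from running at all, so that value should be chosen deliberately.

```python
def throttle_function(lambda_client, function_name, concurrency_limit):
    """Cap a function's reserved concurrency and return the value now in effect."""
    if concurrency_limit < 0:
        raise ValueError("concurrency_limit must be zero or greater")
    response = lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=concurrency_limit,
    )
    return response["ReservedConcurrentExecutions"]
```

Returning the applied value gives the notification step something concrete to include in the confirmation sent to finance and engineering.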
Template
Nightly CloudWatch Log Export → Lambda Archiver → S3 Compliance Store
This template runs on a nightly CloudWatch Events schedule to invoke a Lambda function that exports designated CloudWatch Log Groups to an S3 bucket, applies retention tagging, and sends a completion notification to the compliance team.
Steps:
- CloudWatch Events cron trigger fires nightly to initiate the log archiving workflow
- Invoke Lambda archiver function to export specified CloudWatch Log Groups to the designated S3 compliance bucket
- Send archive completion summary including log groups processed, file sizes, and S3 paths to the compliance team via email
Connectors Used: AWS CloudWatch, AWS Lambda
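A possible shape for the archiver function's core is sketched below, assuming `logs_client` behaves like a boto3 CloudWatch Logs client and that the destination bucket already permits exports from CloudWatch Logs. The function name, task name format, and S3 prefix layout are hypothetical choices.

```python
from datetime import datetime, timedelta, timezone


def export_previous_day(logs_client, log_group, bucket, now=None):
    """Start an export of the previous UTC day's logs and return the task ID."""
    now = now or datetime.now(timezone.utc)
    end = now.replace(hour=0, minute=0, second=0, microsecond=0)
    start = end - timedelta(days=1)
    response = logs_client.create_export_task(
        taskName=f"archive-{start:%Y-%m-%d}",
        logGroupName=log_group,
        fromTime=int(start.timestamp() * 1000),  # milliseconds since epoch
        to=int(end.timestamp() * 1000),
        destination=bucket,
        destinationPrefix=f"{log_group.strip('/')}/{start:%Y/%m/%d}",
    )
    return response["taskId"]
```

Export tasks run asynchronously, so the completion summary in the final step would need to poll the task status or otherwise wait for the export to finish before reporting file sizes and paths.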