CloudWatch

AWS monitoring as a service: collects metrics, logs, and events from AWS resources

3 Major Services

Metrics

  • Used to monitor most AWS services

  • You monitor your environment by configuring and viewing CloudWatch metrics

  • Metrics are specific to each AWS service or resource, for example:

    • EC2 per-instance metrics:

      • CPUUtilization

      • CPUCreditUsage

    • ELB metrics:

      • RequestCount

      • UnHealthyHostCount

    • S3 metrics:

      • NumberOfObjects

      • BucketSizeBytes

  • EC2 basic (standard) monitoring: 5-minute periods, no extra charge (typical for DEV)

  • EC2 detailed monitoring: 1-minute periods, at extra charge (typical for PROD)

  • CloudWatch alarms can be created to trigger alerts (or other actions in your AWS account, such as publishing to an SNS topic) based on thresholds you set on CloudWatch metrics

  • Auto Scaling relies heavily on CloudWatch, using thresholds and alarms to trigger the addition (or removal) of instances in an Auto Scaling group.
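
The threshold-driven scaling described above can be sketched in a few lines. This is a simplified illustration, not the actual Auto Scaling algorithm: the function name, the 70/30 thresholds, the one-instance step, and the group size limits are all assumptions chosen for the example.

```python
from statistics import mean


def desired_capacity(current, cpu_datapoints, scale_out_above=70.0,
                     scale_in_below=30.0, min_size=1, max_size=4):
    """Return the new instance count after one evaluation (hypothetical policy)."""
    avg = mean(cpu_datapoints)
    if avg > scale_out_above:
        target = current + 1      # high-CPU alarm: add an instance
    elif avg < scale_in_below:
        target = current - 1      # low-CPU alarm: remove an instance
    else:
        target = current          # inside the band: no change
    # Never leave the group's configured size limits.
    return max(min_size, min(max_size, target))


print(desired_capacity(2, [82.0, 91.5, 77.3]))  # average above 70
print(desired_capacity(2, [12.0, 9.5, 20.1]))   # average below 30
print(desired_capacity(4, [95.0, 99.0, 97.0]))  # already at max_size
```

The datapoints stand in for CPUUtilization averages over consecutive periods (5-minute with basic monitoring, 1-minute with detailed monitoring).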

Logs

  • Centralized Repository

  • Log streams (e.g. shipped by the CloudWatch Logs agent on EC2)

    • Push log files (for example Apache server logs) from instances into the central repository, where they can be monitored

  • Metrics can be derived from log data (metric filters)
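
To make the idea of deriving a metric from log data concrete, here is a local sketch that counts HTTP status classes in Apache access log lines. It only imitates what a metric filter does; the regular expression, the function name, and the sample lines are assumptions for illustration and do not use CloudWatch's filter pattern syntax.

```python
import re
from collections import Counter

# Apache common log format: host ident user [time] "request" status size
LOG_LINE = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) \S+')


def count_status_classes(lines):
    """Count log lines per status class (2xx, 4xx, 5xx, ...)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        counts[match.group("status")[0] + "xx"] += 1
    return counts


sample = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2023:13:55:40 +0000] "GET /missing HTTP/1.1" 404 209',
    '203.0.113.7 - - [10/Oct/2023:13:55:41 +0000] "POST /api HTTP/1.1" 500 512',
    'not an access log line',
]
counts = count_status_classes(sample)
print(counts["2xx"], counts["4xx"], counts["5xx"])
```

Each count plays the role of one datapoint of a derived metric, for example a 5xx error count that an alarm could then watch.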

Events

  • Create rules to respond to events in the event stream

    • e.g. trigger a Lambda function when an EC2 instance enters the running state

  • Schedule events

    • Make simple timers (rate) or date-based (cron) events
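
A rule for the "EC2 instance enters running state" case pairs an event pattern with a target. The sketch below shows such a pattern and a deliberately simplified matcher that runs locally. The matcher covers only exact-value matching on nested keys, and `on_instance_running` is a hypothetical stand-in for the Lambda function; the instance ID is made up.

```python
# Event pattern for EC2 state-change events where the new state is "running".
RUNNING_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["running"]},
}


def matches(pattern, event):
    """Simplified matching: every pattern key must exist in the event, and the
    event's value must be one of the values listed in the pattern."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


def on_instance_running(event):
    # Hypothetical handler standing in for the Lambda target.
    return f"instance {event['detail']['instance-id']} is running"


event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "running"},
}
if matches(RUNNING_PATTERN, event):
    print(on_instance_running(event))
```

An event whose `detail.state` is `stopped` would not match, so the handler would not run.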

EC2 Monitoring

  • System Status Checks: (problems with the underlying AWS infrastructure, outside of our control)

    • Loss of network connectivity

    • Loss of system power

    • Software issues on the physical host

    • Hardware issues on the physical host

    • How to solve: Generally, stopping and then starting the instance fixes the issue, because the instance is launched on different physical hardware (a reboot alone keeps it on the same host).

  • Instance Status Checks: (software issues in our control)

    • Failed system status checks

    • Misconfigured networking or startup configuration

    • Exhausted memory

    • Corrupted file system

    • Incompatible kernel

    • How to solve: Generally a reboot, or fixing the underlying configuration issue (networking, startup, or file system)

  • By default, CloudWatch automatically monitors metrics that are visible at the host (hypervisor) level, not inside the operating system, such as:

    • CPUUtilization

    • NetworkIn / NetworkOut

    • CPUCreditBalance

    • CPUCreditUsage

  • OS-level metrics require additional software on the instance (Perl monitoring scripts provided by AWS, since superseded by the CloudWatch agent):

    • Memory utilization, memory used, and memory available

    • Swap utilization

    • Disk space utilization, disk space used, disk space available.
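
Memory is a good example of why OS-level metrics need something running inside the instance: the value has to be read from the operating system and then published as a custom metric. The sketch below parses `/proc/meminfo`-style text (a hard-coded sample, so it runs anywhere) and shapes the result like one custom metric datapoint. The function name, the metric name `MemoryUtilization`, and the sample numbers are assumptions, and nothing is sent to AWS.

```python
def memory_utilization(meminfo_text):
    """Return used memory as a percentage, from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])  # values are in kB
    total = values["MemTotal"]
    available = values["MemAvailable"]
    return 100.0 * (total - available) / total


# Sample input; on a Linux instance this text would come from /proc/meminfo.
SAMPLE = """MemTotal:        8000000 kB
MemFree:         1000000 kB
MemAvailable:    2000000 kB
"""

# Shaped like one custom metric datapoint (illustrative metric name).
datum = {
    "MetricName": "MemoryUtilization",
    "Unit": "Percent",
    "Value": memory_utilization(SAMPLE),
}
print(datum)
```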

CloudWatch Alarms

  • Let you (or the sysadmin) be notified when defined thresholds are crossed on CloudWatch metrics

  • For example, you can set up an alarm that triggers whenever the CPUUtilization metric on an EC2 instance goes above 70%

  • Alarms can also trigger other actions in AWS, such as publishing to an SNS topic or triggering Auto Scaling.
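
The 70% CPUUtilization example can be sketched as a small state evaluation. This is a simplified model, assuming the alarm fires only when the most recent `evaluation_periods` datapoints all exceed the threshold. The function names are hypothetical, and `publish` stands in for an action such as sending to an SNS topic.

```python
def alarm_state(datapoints, threshold=70.0, evaluation_periods=2):
    """Return "ALARM" if the last `evaluation_periods` datapoints all exceed
    the threshold, "OK" otherwise, or "INSUFFICIENT_DATA" if too few exist."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    return "ALARM" if all(value > threshold for value in recent) else "OK"


def notify(state, publish):
    """Call the alarm action only when the alarm is firing."""
    if state == "ALARM":
        publish("CPUUtilization above threshold")


sent = []  # collects messages instead of publishing to a real topic
state = alarm_state([55.0, 72.5, 88.0])
notify(state, sent.append)
print(state, sent)
```

Requiring more than one period above the threshold keeps a single short CPU spike from triggering a notification.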
