Summary: This wiki page demonstrates setting up CloudWatch monitoring for a DokuWiki instance on AWS Lightsail, including CloudWatch agent installation with custom Apache log collection, Vector integration for systemd journal logs, and dashboard/alarm configuration.
Date: 14 July 2025
In my previous page I described how I set up DokuWiki on an AWS Lightsail instance. In this page I will describe how to set up CloudWatch for monitoring that instance.
Overall, the following techniques are used:
- The CloudWatch agent collects metrics and Apache logs from the instance and sends them to CloudWatch.
- Vector ships the systemd journal entries to CloudWatch.
- CloudWatch dashboards and alarms visualize the metrics and send notifications.
We need an IAM user with the required permissions to allow the CloudWatch agent to send metrics and logs to CloudWatch. Follow these steps:
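Behind those steps, the user just needs the AWS managed policy for the agent attached. As a minimal sketch, the equivalent AWS CLI commands could look like this (the user name `cloudwatch-agent` is illustrative):

```bash
# Create a dedicated IAM user for the agent (the name is illustrative)
aws iam create-user --user-name cloudwatch-agent

# Attach the AWS managed policy that grants the agent permission to
# publish metrics and logs to CloudWatch
aws iam attach-user-policy \
  --user-name cloudwatch-agent \
  --policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy

# Generate the access key the agent will use (store it in your password manager)
aws iam create-access-key --user-name cloudwatch-agent
```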
Now we can download and install the CloudWatch agent:
```bash
# download the latest cloudwatch agent package
wget https://s3.amazonaws.com/amazoncloudwatch-agent/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb
sudo dpkg -i -E ./amazon-cloudwatch-agent.deb

# Setup the credentials so that the cloudwatch agent can write to cloudwatch
sudo aws configure --profile AmazonCloudWatchAgent
AWS Access Key ID [None]: See Lastpass
AWS Secret Access Key [None]: See Lastpass
Default region name [None]:
Default output format [None]:
```
Now we can create an initial configuration for the agent using the config wizard. Set the values as close as possible to the values below; we will change the config file afterwards to include the required metrics and logs.
```bash
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
```
and use the following settings: choose Standard to create a basic setup.
Edit the file afterwards as explained here, so now open the config file:

```bash
sudo vi /opt/aws/amazon-cloudwatch-agent/bin/config.json
```
{ "agent": { "metrics_collection_interval": 60, "run_as_user": "root" }, "logs": { "logs_collected": { "files": { "collect_list": [ { "file_path": "/opt/bitnami/apache/logs/access_log_cloudwatch", "log_group_name": "apache/access", "log_stream_name": "ApacheAccess", "retention_in_days": 90 }, { "file_path": "/opt/bitnami/apache/logs/error_log_cloudwatch", "log_group_name": "apache/error", "log_stream_name": "ApacheError", "retention_in_days": 90 }, { "file_path": "/var/log/dpkg.log", "log_group_name": "dpkg-logs", "log_stream_name": "dpkg", "retention_in_days": 90 } ] } } }, "metrics": { "metrics_collected": { "cpu": { "measurement": [ "cpu_usage_idle", "cpu_usage_iowait", "cpu_usage_user", "cpu_usage_system", "cpu_usage_active" ], "metrics_collection_interval": 60, "totalcpu": true }, "disk": { "measurement": [ "used_percent" ], "metrics_collection_interval": 60, "resources": [ "*" ] }, "diskio": { "measurement": [ "io_time" ], "metrics_collection_interval": 60, "resources": [ "*" ] }, "mem": { "measurement": [ "mem_used_percent" ], "metrics_collection_interval": 60 }, "statsd": { "metrics_aggregation_interval": 60, "metrics_collection_interval": 60, "service_address": ":8125" }, "swap": { "measurement": [ "swap_used_percent" ], "metrics_collection_interval": 60 }, "processes": { "measurement": [ "total", "idle", "wait", "running", "sleeping", "dead", "zombies" ] } } } }
Note that we also added the `/var/log/dpkg.log` log file to the configuration, which is used for monitoring package installations and updates.
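Once the dpkg log is flowing into CloudWatch, a Logs Insights query along these lines (the filter pattern is only an illustration) surfaces recent package installs and upgrades from the `dpkg-logs` group:

```
fields @timestamp, @message
| filter @message like /install|upgrade/
| sort @timestamp desc
| limit 20
```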
Now we need to configure the credentials:
```bash
sudo vi /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml

# Uncomment and edit the following lines:
[credentials]
    shared_credential_profile = "AmazonCloudWatchAgent"
```
As you can see in the config file above, we will collect the Apache logs from a custom log file, which we need to configure. For that we will follow the tutorial from here. We need to change the logging section in the Apache setup. To make the changes a bit clearer, I'll first show the original logging section, and then the new logging section with the changes.
Open the Apache config:

```bash
sudo vi /opt/bitnami/apache/conf/httpd.conf
```
Original logging section
```apacheconf
#
# ErrorLog: The location of the error log file.
# If you do not specify an ErrorLog directive within a <VirtualHost>
# container, error messages relating to that virtual host will be
# logged here. If you *do* define an error logfile for a <VirtualHost>
# container, that host's errors will be logged there and not here.
#
ErrorLog "logs/error_log"

#
# LogLevel: Control the number of messages logged to the error_log.
# Possible values include: debug, info, notice, warn, error, crit,
# alert, emerg.
#
LogLevel warn

<IfModule log_config_module>
    #
    # The following directives define some format nicknames for use with
    # a CustomLog directive (see below).
    #
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
    LogFormat "%h %l %u %t \"%r\" %>s %b" common

    <IfModule logio_module>
      # You need to enable mod_logio.c to use %I and %O
      LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedio
    </IfModule>

    #
    # The location and format of the access logfile (Common Logfile Format).
    # If you do not define any access logfiles within a <VirtualHost>
    # container, they will be logged here. Contrariwise, if you *do*
    # define per-<VirtualHost> access logfiles, transactions will be
    # logged therein and *not* in this file.
    #
    CustomLog "logs/access_log" common

    #
    # If you prefer a logfile with access, agent, and referer information
    # (Combined Logfile Format) you can use the following directive.
    #
    #CustomLog "logs/access_log" combined
</IfModule>
```
Updated logging section
```apacheconf
#
# ErrorLog: The location of the error log file.
# If you do not specify an ErrorLog directive within a <VirtualHost>
# container, error messages relating to that virtual host will be
# logged here. If you *do* define an error logfile for a <VirtualHost>
# container, that host's errors will be logged there and not here.
#
ErrorLog "/opt/bitnami/apache/logs/error_log_cloudwatch"
ErrorLogFormat "{\"time\":\"%{%usec_frac}t\", \"function\" : \"[%-m:%l]\" , \"process\" : \"[pid%P]\" ,\"message\" : \"%M\"}"

#
# LogLevel: Control the number of messages logged to the error_log.
# Possible values include: debug, info, notice, warn, error, crit,
# alert, emerg.
#
LogLevel warn

<IfModule log_config_module>
    #
    # The following directives define some format nicknames for use with
    # a CustomLog directive (see below).
    #
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
    LogFormat "%h %l %u %t \"%r\" %>s %b" common
    LogFormat "{ \"time\":\"%{%Y-%m-%d}tT%{%T}t.%{msec_frac}tZ\", \"process\":\"%D\", \"filename\":\"%f\", \"remoteIP\":\"%a\", \"host\":\"%V\", \"request\":\"%U\",\"query\":\"%q\",\"method\":\"%m\", \"status\":\"%>s\", \"userAgent\":\"%{User-agent}i\",\"referer\":\"%{Referer}i\"}" cloudwatch

    <IfModule logio_module>
      # You need to enable mod_logio.c to use %I and %O
      LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedio
    </IfModule>

    #
    # The location and format of the access logfile (Common Logfile Format).
    # If you do not define any access logfiles within a <VirtualHost>
    # container, they will be logged here. Contrariwise, if you *do*
    # define per-<VirtualHost> access logfiles, transactions will be
    # logged therein and *not* in this file.
    #
    CustomLog "/opt/bitnami/apache/logs/access_log_cloudwatch" cloudwatch

    #
    # If you prefer a logfile with access, agent, and referer information
    # (Combined Logfile Format) you can use the following directive.
    #
    #CustomLog "logs/access_log" combined
</IfModule>
```
Now we need to restart the Apache server to apply the changes:
```bash
sudo /opt/bitnami/ctlscript.sh restart apache
```
To start or restart the CloudWatch agent, we can use the following command:

```bash
sudo amazon-cloudwatch-agent-ctl -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -a fetch-config -s
```
To check the status of the CloudWatch agent, we can use the following command:

```bash
sudo amazon-cloudwatch-agent-ctl -a status
```
In case something doesn't work, you can check the CloudWatch agent log:

```bash
tail -f /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log
```
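As an optional sanity check, and assuming you have AWS credentials available wherever you run it, you can confirm that the agent's metrics are actually arriving in the `CWAgent` namespace:

```bash
# List the metrics the agent has published to the CWAgent namespace
aws cloudwatch list-metrics --namespace CWAgent --region eu-west-1
```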
For all the metrics widgets below, you need to add a new widget with data type “Metrics” and widget type “Line”. Then go to the widget's Source tab and add the JSON as shown below.
To add log widgets, go to the CloudWatch console and select the Logs Insights tab. Then select the log group you want to query, which in this case is `apache/access` for the access logs and `apache/error` for the error logs. You can then use the queries below to get insights into the logs. Once the query has run, you can click “Add to dashboard” and select the dashboard you want to add the widget to.
{ "metrics": [ [ "CWAgent", "processes_running", "host", "wiki", { "region": "eu-west-1", "label": "Running" } ], [ ".", "processes_sleeping", ".", ".", { "region": "eu-west-1", "label": "Sleeping" } ], [ ".", "processes_dead", ".", ".", { "region": "eu-west-1", "label": "Dead" } ], [ ".", "processes_zombies", ".", ".", { "region": "eu-west-1", "label": "Zombie" } ], [ ".", "processes_total", ".", ".", { "region": "eu-west-1", "label": "Total" } ], [ ".", "processes_idle", ".", ".", { "region": "eu-west-1", "label": "Idle" } ] ], "view": "timeSeries", "stacked": false, "region": "eu-west-1", "period": 300, "stat": "Average", "title": "wiki.getshifting.com - Processes" }
{ "metrics": [ [ "CWAgent", "mem_used_percent", "host", "wiki", { "label": "Memory usage", "region": "eu-west-1" } ], [ ".", "swap_used_percent", ".", ".", { "label": "Swap usage", "region": "eu-west-1" } ] ], "view": "timeSeries", "stacked": false, "region": "eu-west-1", "title": "wiki.getshifting.com - Memory", "period": 300, "stat": "Average" }
{ "metrics": [ [ "CWAgent", "disk_used_percent", "path", "/", "host", "wiki", "device", "nvme0n1p1", "fstype", "ext4", { "label": "Disk Space Usage", "region": "eu-west-1" } ] ], "view": "timeSeries", "stacked": false, "region": "eu-west-1", "title": "wiki.getshifting.com - Disk Usage", "period": 300, "stat": "Average" }
{ "metrics": [ [ "CWAgent", "diskio_io_time", "host", "wiki", "name", "nvme0n1p1", { "label": "Disk IO Time (The amount of time that the disk has had I/O requests queued)", "region": "eu-west-1" } ] ], "view": "timeSeries", "stacked": false, "region": "eu-west-1", "title": "wiki.getshifting.com - Disk IO Time", "period": 300, "stat": "Average" }
{ "metrics": [ [ "CWAgent", "cpu_usage_user", "host", "wiki", "cpu", "cpu-total", { "region": "eu-west-1", "label": "User" } ], [ ".", "cpu_usage_system", ".", ".", ".", ".", { "region": "eu-west-1", "label": "System" } ], [ ".", "cpu_usage_iowait", ".", ".", ".", ".", { "region": "eu-west-1", "label": "IO Wait" } ], [ ".", "cpu_usage_idle", ".", ".", ".", ".", { "region": "eu-west-1", "visible": false, "label": "Idle" } ], [ ".", "cpu_usage_active", ".", ".", ".", ".", { "region": "eu-west-1", "label": "Active" } ] ], "view": "timeSeries", "stacked": false, "region": "eu-west-1", "period": 300, "title": "wiki.getshifting.com - CPU", "stat": "Average" }
```
fields @timestamp, remoteIP, method, status
| filter status = "200" and method = "GET"
| stats count_distinct(remoteIP) as UniqueVisits
```
```
fields @timestamp, message
| limit 20
```

```
fields @timestamp, message
| limit 10
```
Traditionally, log files on a Linux system were stored in the `/var/log` directory, but nowadays, on systemd-based systems, the logs are stored in the systemd journal. You can check `cat /var/log/README` for confirmation. To still be able to send these logs to CloudWatch, we'll configure Vector, a tool from Datadog, to ship the journalctl entries to CloudWatch.
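As a quick read-only check, you can peek at the entries Vector will pick up:

```bash
# Show the most recent systemd journal entries
journalctl -n 20 --no-pager
```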
We need an IAM user with the required permissions to allow the Vector agent to send logs to CloudWatch. Follow these steps:
{ "Version": "2012-10-17", "Statement": [ { "Sid": "CloudWatchLogsPermissions", "Effect": "Allow", "Action": [ "logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents", "logs:DescribeLogGroups", "logs:DescribeLogStreams", "logs:ListTagsLogGroup" ], "Resource": "*" } ] }
Now we can create a new credentials file for the Vector agent:
```
sjoerd@wiki:~$ aws configure --profile VectorAgent
AWS Access Key ID [None]: See Lastpass
AWS Secret Access Key [None]: See Lastpass
Default region name [None]: eu-west-1
Default output format [None]:
```
Now we need to set permissions so that Vector can read the credentials file:

```bash
chmod o+r .aws/credentials
```
As the Lightsail instance uses the dpkg package manager, we can install Vector using the following steps:
```bash
curl \
  --proto '=https' \
  --tlsv1.2 -O \
  https://apt.vector.dev/pool/v/ve/vector_0.48.0-1_amd64.deb
sudo dpkg -i vector_0.48.0-1_amd64.deb
```
- Configure Vector: `sudo vi /etc/vector/vector.yaml`
To view the resulting configuration without comments and blank lines:

```bash
cat /etc/vector/vector.yaml | grep -v '^\s*$\|^\s*\#'
```
```yaml
sources:
  journald_source:
    type: "journald"

sinks:
  cloudwatch_sink:
    type: "aws_cloudwatch_logs"
    auth:
      credentials_file: "/home/sjoerd/.aws/credentials"
      profile: "VectorAgent"
    inputs:
      - "journald_source"
    compression: "gzip"
    encoding:
      codec: "json"
    region: "eu-west-1"
    group_name: "systemd-journal"
    stream_name: "journalctl"
```
- Validate the Vector config: `sudo vector validate /etc/vector/vector.yaml`
- Start Vector: `sudo systemctl start vector`
- Enable Vector at boot: `sudo systemctl enable vector`
- Check the status: `sudo systemctl status vector`
In case of problems, check the Vector service logs:

```bash
sudo journalctl -u vector.service
```
You can query the resulting `systemd-journal` log group in Logs Insights, for example:

```
fields @timestamp, message
| limit 50
```
We want to monitor the certificate expiration date:
You can also set a label and provide a new name. Then this will be the source:
{ "metrics": [ [ "AWS/CertificateManager", "DaysToExpiry", "CertificateArn", "arn:aws:acm:us-east-1:410123456772:certificate/175bbc5b-cd9b-45b2-b906-059e12589237", { "region": "us-east-1", "label": "getshifting.com" } ], [ "...", "arn:aws:acm:us-east-1:410123456772:certificate/2598de1a-fea6-40c0-9296-e6cb18ae8a26", { "region": "us-east-1", "label": "wiki.getshifting.com" } ] ], "sparkline": true, "view": "singleValue", "region": "us-east-1", "period": 300, "stat": "Average", "title": "Certificate - DaysToExpire" }
Afterwards, you need to change the permissions to allow access to the log groups and alarms. Click on the IAM role from the sharing overview and change the policy as below:
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "ec2:DescribeTags", "cloudwatch:GetMetricData" ], "Resource": "*" }, { "Effect": "Allow", "Action": [ "cloudwatch:GetInsightRuleReport", "cloudwatch:DescribeAlarms", "cloudwatch:GetDashboard" ], "Resource": [ "arn:aws:cloudwatch::410123456772:dashboard/GetShiftingDashboard" ] }, { "Effect": "Allow", "Action": [ "logs:FilterLogEvents", "logs:StartQuery", "logs:StopQuery", "logs:GetLogRecord", "logs:DescribeLogGroups" ], "Resource": [ "arn:aws:logs:eu-west-1:410123456772:log-group:apache/access:*", "arn:aws:logs:eu-west-1:410123456772:log-group:apache/error:*", "arn:aws:logs:eu-west-1:410123456772:log-group:dpkg-logs:*", "arn:aws:logs:eu-west-1:410123456772:log-group:systemd-journal:*" ] }, { "Effect": "Allow", "Action": "cloudwatch:DescribeAlarms", "Resource": "*" } ] }
We want to be notified when the root disk is almost full, so we will create an alarm for that, based on the disk usage metrics collected by the CloudWatch agent.
First we need to create an SNS topic to send the alarm notifications to. Follow these steps:
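The console steps map onto a single CLI call; as a sketch (the topic name `wiki-alarms` is illustrative):

```bash
# Create the SNS topic that will receive the alarm notifications
aws sns create-topic --name wiki-alarms --region eu-west-1
```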
Now we need to subscribe to the topic, so we can receive the notifications:
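For an email subscription, the CLI equivalent would look roughly like this (the topic ARN and address are placeholders); SNS then sends a confirmation mail that you need to accept before notifications are delivered:

```bash
# Subscribe an email address to the topic (confirm via the mail SNS sends)
aws sns subscribe \
  --topic-arn arn:aws:sns:eu-west-1:123456789012:wiki-alarms \
  --protocol email \
  --notification-endpoint you@example.com
```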
If required, you can test the subscription by publishing a test message to the topic, through the 'Publish message' option in the topic details page.
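The same test is available from the CLI (again with the placeholder topic ARN):

```bash
# Publish a test message to verify the subscription delivers
aws sns publish \
  --topic-arn arn:aws:sns:eu-west-1:123456789012:wiki-alarms \
  --subject "Test alarm" \
  --message "If you can read this, the SNS subscription works."
```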
We want to create an alarm that notifies us when the root disk is almost full, using the CloudWatch agent's disk usage metric. The alarm will be triggered when disk usage exceeds 90%.
Go to the CloudWatch console and follow these steps:
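Those console steps correspond to a single `put-metric-alarm` call; a minimal sketch, assuming the dimensions from the disk usage widget above and the placeholder topic ARN:

```bash
# Alarm when the root filesystem stays above 90% usage for one 5-minute period
aws cloudwatch put-metric-alarm \
  --alarm-name wiki-root-disk-90 \
  --namespace CWAgent \
  --metric-name disk_used_percent \
  --dimensions Name=path,Value=/ Name=host,Value=wiki Name=device,Value=nvme0n1p1 Name=fstype,Value=ext4 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 90 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:eu-west-1:123456789012:wiki-alarms \
  --region eu-west-1
```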