Cloud Foundry provides several built-in mechanisms that allow us to monitor our applications' state changes and behavior. Additionally, Cloud Foundry actively monitors the health of our application processes and will restart them should they crash. In this lab, we’ll explore a few of these mechanisms.
Cloud Foundry only allows application configuration to be modified via its API. This gives application operators confidence that all changes to application configuration are known and auditable. It also reduces the number of causes that must be considered when problems arise.
All application configuration changes are recorded as events. These events can be viewed via the Cloud Foundry API, and viewing is facilitated via the CLI.
-
Take a look at the events that have transpired so far for our deployment of
orders-service-${initials}
:$ cf events orders-service-${initials} Getting events for app orders-service-${initials} in org Workshop / space user1 as [email protected]... time event actor description 2015-02-13T15:18:33.00-0600 audit.app.update [email protected] instances: 1 (6) 2015-02-13T15:04:34.00-0600 audit.app.update [email protected] instances: 3 (5) 2015-02-13T12:56:35.00-0600 audit.app.update [email protected] state: STARTED (4) 2015-02-13T12:56:26.00-0600 audit.app.update [email protected] (3) 2015-02-13T12:56:26.00-0600 audit.app.map-route [email protected] (2) 2015-02-13T12:56:24.00-0600 audit.app.create [email protected] instances: 1, memory: 512, state: STOPPED, environment_json: PRIVATE DATA HIDDEN (1)
-
Events are sorted newest to oldest, so we’ll start from the bottom. Here we see the
app.create
event, which created our application’s record and stored all of its metadata (e.g.memory: 512
). -
The
app.map-route
event records the incoming request to assign a route to our application. -
This
app.update
event records the resulting change to our applications metadata. -
This
app.update
event records the change of our application’s state toSTARTED
. -
Remember scaling the application up? This
app.update
event records the metadata changeinstances: 3
. -
And here’s the
app.update
event recording our scaling of the application back down withinstances: 1
.
-
-
Let’s explicitly ask for the application to be stopped:
$ cf stop orders-service-${initials} Stopping app orders-service-${initials} in org Workshop / space user1 as [email protected]... OK
-
Now, examine the additional
app.update
event:$ cf events orders-service-${initials} Getting events for app orders-service-${initials} in org Workshop / space user1 as [email protected]... time event actor description 2015-02-13T15:59:10.00-0600 audit.app.update [email protected] state: STOPPED 2015-02-13T15:18:33.00-0600 audit.app.update [email protected] instances: 1 2015-02-13T15:04:34.00-0600 audit.app.update [email protected] instances: 3 2015-02-13T12:56:35.00-0600 audit.app.update [email protected] state: STARTED 2015-02-13T12:56:26.00-0600 audit.app.update [email protected] 2015-02-13T12:56:26.00-0600 audit.app.map-route [email protected] 2015-02-13T12:56:24.00-0600 audit.app.create [email protected] instances: 1, memory: 512, state: STOPPED, environment_json: PRIVATE DATA HIDDEN
-
Start the application again:
$ cf start orders-service-${initials} Starting app cf-scale-boot in org Workshop / space user1 as [email protected]... 0 of 1 instances running, 1 starting 0 of 1 instances running, 1 starting 0 of 1 instances running, 1 starting 1 of 1 instances running App started OK App orders-service-${initials} was started using this command `JAVA_HOME=$PWD/.java-buildpack/open_jdk_jre JAVA_OPTS="-Djava.io.tmpdir=$TMPDIR -XX:OnOutOfMemoryError=$PWD/.java-buildpack/open_jdk_jre/bin/killjava.sh -Xmx382293K -Xms382293K -XX:MaxMetaspaceSize=64M -XX:MetaspaceSize=64M -Xss995K" SERVER_PORT=$PORT $PWD/.java-buildpack/spring_boot_cli/bin/spring run app.groovy` Showing health and status for app orders-service-${initials} in org Workshop / space user1 as [email protected]... OK requested state: started instances: 1/1 usage: 512M x 1 instances urls: orders-service-${initials}-stockinged-rust.cfapps.io last uploaded: Fri Feb 13 18:56:29 UTC 2015 state since cpu memory disk #0 running 2015-02-13 04:01:50 PM 0.0% 389.1M of 512M 128.9M of 1G
-
And again, view the additional
app.update
event:$ cf events orders-service-${initials} Getting events for app orders-service-${initials} in org Workshop / space user1 as [email protected]... time event actor description 2015-02-13T16:01:28.00-0600 audit.app.update [email protected] state: STARTED 2015-02-13T15:59:10.00-0600 audit.app.update [email protected] state: STOPPED 2015-02-13T15:18:33.00-0600 audit.app.update [email protected] instances: 1 2015-02-13T15:04:34.00-0600 audit.app.update [email protected] instances: 3 2015-02-13T12:56:35.00-0600 audit.app.update [email protected] state: STARTED 2015-02-13T12:56:26.00-0600 audit.app.update [email protected] 2015-02-13T12:56:26.00-0600 audit.app.map-route [email protected] 2015-02-13T12:56:24.00-0600 audit.app.create [email protected] instances: 1, memory: 512, state: STOPPED, environment_json: PRIVATE DATA HIDDEN
One of the most important enablers of visibility into application behavior is logging.
Effective management of logs has historically been very difficult.
Cloud Foundry’s log aggregation components simplify log management by assuming responsibility for it.
Application developers need only log all messages to either STDOUT
or STDERR
, and the platform will capture these messages.
Application developers can view application logs using the CF CLI.
-
Let’s view recent log messages for
orders-service-${initials}
:$ cf logs orders-service-${initials} --recent
Here are two interesting subsets of one output from that command:
Example 1. CF Component Logs2015-02-13T14:45:39.40-0600 [RTR/0] OUT orders-service-${initials}-stockinged-rust.cfapps.io - [13/02/2015:20:45:39 +0000] "GET /css/bootstrap.min.css HTTP/1.1" 304 0 "http://orders-service-${initials}-stockinged-rust.cfapps.io/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.111 Safari/537.36" 10.10.66.88:50372 x_forwarded_for:"50.157.39.197" vcap_request_id:84cc1b7a-bb30-4355-7512-5adaf36ff767 response_time:0.013115764 app_id:7a428901-1691-4cce-b7f6-62d186c5cb55 (1) 2015-02-13T14:45:39.40-0600 [RTR/1] OUT orders-service-${initials}-stockinged-rust.cfapps.io - [13/02/2015:20:45:39 +0000] "GET /img/LOGO_CloudFoundry_Large.png HTTP/1.1" 304 0 "http://orders-service-${initials}-stockinged-rust.cfapps.io/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.111 Safari/537.36" 10.10.66.88:24323 x_forwarded_for:"50.157.39.197" vcap_request_id:b3e2466b-6a41-4c6d-5b3d-0f70702c0ec1 response_time:0.010003444 app_id:7a428901-1691-4cce-b7f6-62d186c5cb55 2015-02-13T15:04:33.09-0600 [API/1] OUT Tried to stop app that never received a start event (2) 2015-02-13T15:04:33.51-0600 [CELL/12] OUT Starting app instance (index 2) with guid 7a428901-1691-4cce-b7f6-62d186c5cb55 (3) 2015-02-13T15:04:33.71-0600 [CELL/4] OUT Starting app instance (index 3) with guid 7a428901-1691-4cce-b7f6-62d186c5cb55
-
An “Apache-style” access log event from the (Go)Router
-
An API log event that corresponds to an event as shown in
cf events
-
A CELL log event indicating the start of an application instance on that CELL.
Example 2. Application Logs2015-02-13T16:01:50.28-0600 [App/0] OUT 2015-02-13 22:01:50.282 INFO 36 --- [ runner-0] o.s.b.a.e.jmx.EndpointMBeanExporter : Located managed bean 'autoConfigurationAuditEndpoint': registering with JMX server as MBean [org.springframework.boot:type=Endpoint,name=autoConfigurationAuditEndpoint] 2015-02-13T16:01:50.28-0600 [App/0] OUT 2015-02-13 22:01:50.287 INFO 36 --- [ runner-0] o.s.b.a.e.jmx.EndpointMBeanExporter : Located managed bean 'shutdownEndpoint': registering with JMX server as MBean [org.springframework.boot:type=Endpoint,name=shutdownEndpoint] 2015-02-13T16:01:50.29-0600 [App/0] OUT 2015-02-13 22:01:50.299 INFO 36 --- [ runner-0] o.s.b.a.e.jmx.EndpointMBeanExporter : Located managed bean 'configurationPropertiesReportEndpoint': registering with JMX server as MBean [org.springframework.boot:type=Endpoint,name=configurationPropertiesReportEndpoint] 2015-02-13T16:01:50.36-0600 [App/0] OUT 2015-02-13 22:01:50.359 INFO 36 --- [ runner-0] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 61316/http 2015-02-13T16:01:50.36-0600 [App/0] OUT Started... 2015-02-13T16:01:50.36-0600 [App/0] OUT 2015-02-13 22:01:50.364 INFO 36 --- [ runner-0] o.s.boot.SpringApplication : Started application in 6.906 seconds (JVM running for 15.65)
As you can see, Cloud Foundry’s log aggregation components capture both application logs and CF component logs relevant to your application. These events are properly interleaved based on time, giving you an accurate picture of events as they transpired across the system.
-
-
To get a running “tail” of the application logs rather than a dump, simply type:
$ cf logs orders-service-${initials}
You can try various things like refreshing the browser and triggering stop/start events to see logs being generated. Stop
cf logs <app_name>
output by pressing Ctrl-C.
Cloud Foundry actively monitors the health of our application processes and will restart them should they crash.
In this lab, you may want to have two terminal windows open to watch
cf logs
while reviewingcf events
.
-
If you don’t have a terminal window already running, start a log tail for
orders-service-${initials}
:$ cf logs orders-service-${initials}
-
Let’s stop the 3rd application instance of your app. See how Cloud Foundry automatically creates another app instance to match my desired capacity
$ cf restart-app-instance orders-service-${initials} 2
-
Look at the application log now, you’ll see some interesting log messages fly by:
2019-10-01T22:18:29.13-0500 [CELL/3] OUT Cell 9aa90abe-6a8f-4485-90d1-71da907de9a3 stopping instance 36a34607-b500-461d-6330-d1fb 2019-10-01T22:18:34.51-0500 [CELL/SSHD/3] OUT Exit status 0 2019-10-01T22:18:34.51-0500 [APP/PROC/WEB/3] OUT 2019-10-02 03:18:34.511 INFO 17 --- [ Thread-9] o.s.s.concurrent.ThreadPoolTaskExecutor : Shutting down ExecutorService 'applicationTaskExecutor' 2019-10-01T22:18:34.51-0500 [APP/PROC/WEB/3] OUT 2019-10-02 03:18:34.514 INFO 17 --- [ Thread-9] j.LocalContainerEntityManagerFactoryBean : Closing JPA EntityManagerFactory for persistence unit 'default' 2019-10-01T22:18:35.04-0500 [APP/PROC/WEB/3] OUT Exit status 143 2019-10-01T22:18:35.06-0500 [CELL/3] OUT Cell 9aa90abe-6a8f-4485-90d1-71da907de9a3 destroying container for instance 36a34607-b500-461d-6330-d1fb 2019-10-01T22:18:35.37-0500 [PROXY/3] OUT Exit status 137 2019-10-01T22:18:35.97-0500 [CELL/3] OUT Cell 9aa90abe-6a8f-4485-90d1-71da907de9a3 successfully destroyed container for instance 36a34607-b500-461d-6330-d1fb
-
Check the application events to see another indicator of the crash. Do this in another terminal window or exit
cf logs
by pressing Ctrl-C:$ cf events orders-service-${initials} Getting events for app orders-service-${initials} in org Workshop / space user1 as [email protected]... time event actor description 2015-02-13T17:17:57.00-0600 app.crash orders-service-${initials} index: 0, reason: CRASHED, exit_description: app instance exited, exit_status: 1
-
By this time you should have noticed some additional interesting events in the logs:
2019-10-01T22:19:01.44-0500 [CELL/3] OUT Cell 2e2fbcce-0d89-4f32-8458-be88c37d4533 creating container for instance 851b3251-df69-4c7a-4d03-341c 2019-10-01T22:19:02.45-0500 [CELL/3] OUT Cell 2e2fbcce-0d89-4f32-8458-be88c37d4533 successfully created container for instance 851b3251-df69-4c7a-4d03-341c 2019-10-01T22:19:02.65-0500 [CELL/3] OUT Downloading droplet... 2019-10-01T22:19:06.62-0500 [CELL/3] OUT Downloaded droplet (89.5M) 2019-10-01T22:19:06.62-0500 [CELL/3] OUT Starting health monitoring of container 2019-10-01T22:19:07.73-0500 [APP/PROC/WEB/3] OUT JVM Memory Configuration: -Xmx145122K -Xss1M -XX:ReservedCodeCacheSize=240M -XX:MaxDirectMemorySize=10M -XX:MaxMetaspaceSize=129309K 2019-10-01T22:19:11.87-0500 [APP/PROC/WEB/3] OUT . ____ _ __ _ _ 2019-10-01T22:19:11.87-0500 [APP/PROC/WEB/3] OUT /\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \ 2019-10-01T22:19:11.87-0500 [APP/PROC/WEB/3] OUT ( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \ 2019-10-01T22:19:11.87-0500 [APP/PROC/WEB/3] OUT \\/ ___)| |_)| | | | | || (_| | ) ) ) ) 2019-10-01T22:19:11.87-0500 [APP/PROC/WEB/3] OUT ' |____| .__|_| |_|_| |_\__, | / / / / 2019-10-01T22:19:11.87-0500 [APP/PROC/WEB/3] OUT =========|_|==============|___/=/_/_/_/ 2019-10-01T22:19:11.88-0500 [APP/PROC/WEB/3] OUT :: Spring Boot :: (v2.1.8.RELEASE)
The CELL indicates that it is starting another instance of the application as a result of the Diego sub-system observing a difference between the desired and actual state (i.e. running instances = 1 vs. running instances = 0).
The new application instance starts logging events as it starts up.
Application operators will also enjoy commands like cf logs
, but are often interested in long-term retention, indexing, and analysis of logs as well.
Cloud Foundry currently only provides short-term retention of logs.
To meet these needs, Cloud Foundry provides the ability to drain logs to third-party providers.
In this section, we’ll drain logs to a very simple provider called Papertrail.
-
Visit https://papertrailapp.com and create a free account.
-
Login to your account and add your first system:
-
Click on “Alternatives”:
-
Choose “I use Heroku” and provide a name:
-
Note the URL + Port assigned to your application:
-
We’ll use a Cloud Foundry user-provided service instance to create the log drain for our application using the URL + Port provided by Papertrail:
$ cf cups orders-service-${initials}-logs -l syslog://logs2.papertrailapp.com:43882 Creating user provided service orders-service-${initials}-logs in org Workshop / space user1 as [email protected]... OK
-
We bind that service instance like those we created in Lab 2 (
cf bs
is short forcf bind-service
):$ cf bs orders-service-${initials} orders-service-${initials}-logs Binding service orders-service-${initials}-logs to app orders-service-${initials} in org Workshop / space user1 as [email protected]... OK TIP: Use 'cf restage' to ensure your env variable changes take effect
-
We’ll use a
cf restart
rather thancf restage
to make the binding take effect:$ cf restart orders-service-${initials}
-
Refresh the Papertrail browser tab to see log events immediately flowing to the log viewing page:
You can see how to connect to other third-party log management systems in the Cloud Foundry documentation.
Because of the limited quota we have for this course, let’s clean up our application and services to make room for future labs.
-
Delete the
orders-service-${initials}
application (cf d
is short forcf delete
):$ cf d orders-service-${initials} Really delete the app orders-service-${initials}?> y Deleting app orders-service-${initials} in org Workshop / space user1 as [email protected]... 'OK
-
Delete the
orders-service-${initials}-logs
service (cf ds
is short forcf delete-service
):$ cf ds orders-service-${initials}-logs Really delete the service orders-service-${initials}-logs?> y Deleting service orders-service-${initials}-logs in org Workshop / space user1 as [email protected]... OK