httpGet described in my previous blog post (Kubernetes Readiness and Liveness with Spring Boot Actuator) is not an option because there is no endpoint to reference. These can be deployed with the Apache Kafka REST Proxy, which gets us on the right path, but it doesn't quite work how we want in this respect.
The Kafka REST Proxy provides endpoints that return some basic status info about connectors. However, the standard Kubernetes httpGet probe treats any status code >= 200 and < 400 as success, and since the Kafka REST status endpoint returns a 200 status code regardless of the connector's state, it is not possible to use this method to determine whether a connector is down.

What we would like to do is check the content of the status call and do a string comparison. For example, when the service is up, the status endpoint indicates that the state is "RUNNING":

# curl http://10.30.128.1:8083/connectors/mysql-kafka-connector/status
{"name":"mysql-kafka-connector","connector":{"state":"RUNNING","worker_id":"
We can pause the connector using this endpoint:
# curl -i -X PUT http://
HTTP/1.1 202 Accepted
And then the state is changed to "PAUSED":

# curl http://
{"name":"mysql-kafka-connector","connector":{"state":"PAUSED","worker_id":"
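The string comparison can also be made exact rather than a substring match. Here is a minimal sketch, assuming the compact JSON shape shown in the output above, with the connector object appearing before any task entries. The name connector_state is a hypothetical helper, not part of Kafka or Kubernetes.

```shell
# Hypothetical helper: reads a connector status JSON document on stdin and
# prints the value of the first "state" field it finds.
# Assumption: the connector object precedes any task entries, as in the
# output shown above, so the first match is the connector's own state.
connector_state() {
  grep -o '"state" *: *"[A-Z_]*"' | head -n 1 | cut -d'"' -f4
}
```

Piping the status call into it, for example `curl -s http://127.0.0.1:8083/connectors/mysql-kafka-connector/status | connector_state`, should print just the state word, which can then be compared with `[ "$state" = "RUNNING" ]`.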
To accomplish this check, we can leverage the exec command probe:

readinessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    - curl -s http://127.0.0.1:8083/connectors/mysql-kafka-connector/status | grep "RUNNING"
  initialDelaySeconds: 240
  periodSeconds: 5
  timeoutSeconds: 5
  successThreshold: 1
  failureThreshold: 10
livenessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    - curl -s http://127.0.0.1:8083/connectors/mysql-kafka-connector/status | grep "RUNNING"
  initialDelaySeconds: 300
  periodSeconds: 60
  timeoutSeconds: 10
  successThreshold: 1
  failureThreshold: 3
The exec command allows us to execute a shell command. In this case, we are:
- running the shell (/bin/sh)
- telling it to run a single command (-c)
- with the command being a cURL call to the specific connector status endpoint, and grepping for the string "RUNNING"
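One caveat with grepping for "RUNNING": as far as I know, the full status document also lists a state for each task, so a substring match could succeed while an individual task has failed. Here is a sketch of a stricter check, assuming compact JSON with the key order shown in the earlier output. The function names and the URL argument are hypothetical.

```shell
# Succeeds only if the status JSON on stdin reports the connector itself as
# RUNNING and contains no FAILED state anywhere (connector or tasks).
# Assumption: compact JSON, with "state" as the first key of the connector
# object, as in the output shown earlier.
status_is_healthy() {
  body=$(cat)
  case "$body" in
    *'"state":"FAILED"'*) return 1 ;;
  esac
  case "$body" in
    *'"connector":{"state":"RUNNING"'*) return 0 ;;
  esac
  return 1
}

# Hypothetical wrapper: fetches the status URL given as $1 and checks it.
# If curl fails, the body is empty and the check fails as well.
check_connector() {
  curl -s -f --max-time 4 "$1" | status_is_healthy
}
```

If these functions were baked into the image as a script, the probe command could source that script and call `check_connector http://127.0.0.1:8083/connectors/mysql-kafka-connector/status` in place of the curl and grep pipeline. The exit status is what Kubernetes evaluates for an exec probe.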
To get the service running again so that it passes the readiness and liveness probes, use the RESUME endpoint:
# curl -i -X PUT http://
HTTP/1.1 202 Accepted