Background
The Confluent Cloud Console is migrating to a new backend service (also known as Kafka REST) replacing its old backend service (also known as Kafka-API).
Customers with private networking setups (PrivateLink, Peering, or Transit Gateway clusters) who need to access their Confluent Cloud clusters locally should already have configured access to the old service when their cluster was originally provisioned.
- Customers should be able to easily spot it in their configurations, as the FQDN starts with either lkaclkc (for PrivateLink clusters) or pkac (for all other clusters).
In order to enable a smooth and uneventful transition, the DNS or proxy configurations used by customers should be modified to also allow access to the new Kafka REST service in addition to the old service. Having both services accessible simultaneously will allow rolling out the migration without any further action needed.
The Kafka REST service in Confluent Cloud has already been enabled for all clusters and should be reachable (e.g. via curl) after successful configuration (see the Troubleshooting Guide section).
The Kafka REST FQDN is the same as the FQDN of the Kafka bootstrap server of the corresponding Confluent Cloud cluster, but because Kafka REST is a RESTful HTTP service, it uses port 443.
- The new service FQDN starts with either lkc (for PrivateLink clusters) or pkc (for all other clusters).
- As the new service FQDN is the same as the Kafka bootstrap server, existing configurations may already allow traffic to it over port 9092 to support the Kafka protocol. Traffic to the same FQDN over port 443 must be allowed as well.
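The prefix and port conventions above can be captured in two small shell helpers. This is a minimal sketch, not a Confluent tool: the function names classify_fqdn and rest_url_from_bootstrap are hypothetical, and the sketch assumes the prefix is always followed by a hyphen, as in the example hostnames used later in this article.

```shell
# Classify a Confluent Cloud FQDN by its leading label, using the prefixes
# described above. Anything else (for example broker hostnames) is "unknown".
classify_fqdn() {
  case "$1" in
    lkaclkc-*) echo "kafka-api privatelink" ;;
    pkac-*)    echo "kafka-api other" ;;
    lkc-*)     echo "kafka-rest privatelink" ;;
    pkc-*)     echo "kafka-rest other" ;;
    *)         echo "unknown" ;;
  esac
}

# Derive the Kafka REST URL from a bootstrap server string such as
# "pkc-abcdef.us-central1.gcp.confluent.cloud:9092": same FQDN, port 443.
rest_url_from_bootstrap() {
  host="${1%:*}"   # strip a trailing ":port" if one is present
  printf 'https://%s:443\n' "$host"
}
```

For example, rest_url_from_bootstrap pkc-abcdef.us-central1.gcp.confluent.cloud:9092 prints https://pkc-abcdef.us-central1.gcp.confluent.cloud:443.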
What needs to be done?
To reiterate the email outreach conducted in January 2023: the DNS or proxy configurations used to allow access to the old (Kafka-API) service should be modified to also allow access to the new (Kafka REST) service. As mentioned above, both services should be accessible simultaneously to allow a seamless transition from the old one to the new one.
If no action is performed, users may lose access to certain UI screens that previously used the Kafka-API endpoint. There is no impact to Kafka produce/consume operations.
Troubleshooting Guide
Configure and Verify Proxy
Both the old and the new backend services are currently enabled for each Confluent Cloud cluster. Connectivity is required both from the proxy to Confluent and from your local environment to the proxy instance. To ensure that your configuration has been correctly modified, follow the steps in the next sections.
Verify connectivity to the Kafka-API service
# use public internet DNS to resolve Confluent Cloud DNS
% dig @8.8.8.8 +short pkac-a1b2c3.us-central1.gcp.confluent.cloud
10.50.0.2
# from proxy instance, test connectivity directly to Confluent
% curl -I --resolve pkac-a1b2c3.us-central1.gcp.confluent.cloud:443:10.50.0.2 https://pkac-a1b2c3.us-central1.gcp.confluent.cloud
HTTP/1.1 401 Unauthorized
WWW-Authenticate: basic realm=""
Cache-Control: must-revalidate,no-cache,no-store
Content-Type: text/html;charset=iso-8859-1
Content-Length: 329
# above is an example of a SUCCESSFUL response (ignore the 401 Unauthorized bit)
# an UNSUCCESSFUL request will hang and time out
# once previous step is successful, next step is to validate proxy configuration (expected to receive same successful response above)
% curl -I --resolve pkac-a1b2c3.us-central1.gcp.confluent.cloud:443:127.0.0.1 https://pkac-a1b2c3.us-central1.gcp.confluent.cloud
HTTP/1.1 401 Unauthorized
WWW-Authenticate: basic realm=""
Cache-Control: must-revalidate,no-cache,no-store
Content-Type: text/html;charset=iso-8859-1
Content-Length: 329
# success!
Verify connectivity to Kafka REST
# use public internet DNS to resolve Confluent Cloud DNS
% dig @8.8.8.8 +short pkc-abcdef.us-central1.gcp.confluent.cloud
10.50.0.3
# from proxy instance, test connectivity directly to Confluent
% curl -I --resolve pkc-abcdef.us-central1.gcp.confluent.cloud:443:10.50.0.3 https://pkc-abcdef.us-central1.gcp.confluent.cloud
HTTP/1.1 401 Unauthorized
Cache-Control: must-revalidate,no-cache,no-store
Content-Type: text/html;charset=iso-8859-1
Content-Length: 356
# above is an example of a SUCCESSFUL response (ignore the 401 Unauthorized bit)
# an UNSUCCESSFUL request will hang and time out
# once previous step is successful, next step is to validate proxy configuration (expected to receive same successful response above)
% curl -I --resolve pkc-abcdef.us-central1.gcp.confluent.cloud:443:127.0.0.1 https://pkc-abcdef.us-central1.gcp.confluent.cloud
HTTP/1.1 401 Unauthorized
Cache-Control: must-revalidate,no-cache,no-store
Content-Type: text/html;charset=iso-8859-1
Content-Length: 356
# success!
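The two-step check above can be wrapped in a small helper. The following is a minimal sketch rather than an official tool: the function names interpret_status and check_endpoint and the timeout values are assumptions chosen for illustration. It relies on curl printing 000 for %{http_code} when no HTTP response was received, and treats any real HTTP status (including 401) as proof that the connection works.

```shell
# Interpret the status code printed by curl's -w '%{http_code}'.
# curl prints 000 when no HTTP response was received (for example on a timeout).
interpret_status() {
  case "$1" in
    000|"")          echo "unreachable" ;;
    [1-5][0-9][0-9]) echo "reachable" ;;
    *)               echo "unexpected" ;;
  esac
}

# check_endpoint FQDN IP
# Send a HEAD request to https://FQDN while forcing FQDN:443 to resolve to IP,
# then report whether any HTTP response came back. Timeouts are illustrative.
check_endpoint() {
  code=$(curl -s -o /dev/null -I -w '%{http_code}' \
    --connect-timeout 5 --max-time 10 \
    --resolve "$1:443:$2" "https://$1")
  interpret_status "$code"
}
```

Run from the proxy instance, check_endpoint pkc-abcdef.us-central1.gcp.confluent.cloud 10.50.0.3 tests the direct path to Confluent, and the same call with 127.0.0.1 as the IP tests the proxy configuration itself.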
Proxy Configuration Notes
If your existing configuration has a single configuration rule routing both ports 9092 and 443, you may need to split it into two simple rules:
- One for routing the pkc-* and b*-pkc-* FQDNs for port 9092
- One for routing the pkc-* and pkac-* FQDNs for port 443
Example nginx configuration:
https://docs.confluent.io/cloud/current/networking/ccloud-console-access.html#configure-a-proxy
load_module '/usr/lib64/nginx/modules/ngx_stream_module.so';

events {}

stream {
    map $ssl_preread_server_name $targetBackend {
        default $ssl_preread_server_name;
    }

    server {
        listen 9092;

        proxy_connect_timeout 1s;
        proxy_timeout 7200s;

        # Run 'nslookup 127.0.0.53' on nginx host to verify resolver and check /var/log/nginx/error.log for any resolving issues using 127.0.0.53
        resolver 127.0.0.53;

        # On lookup failure, reconfigure to use the cloud provider's resolver
        # resolver 169.254.169.253; # for AWS
        # resolver 168.63.129.16; # for Azure
        # resolver 169.254.169.254; # for Google

        proxy_pass $targetBackend:9092;
        ssl_preread on;
    }

    server {
        listen 443;

        proxy_connect_timeout 1s;
        proxy_timeout 7200s;

        resolver 127.0.0.53;

        proxy_pass $targetBackend:443;
        ssl_preread on;
    }

    log_format stream_routing '[$time_local] remote address $remote_addr'
                              'with SNI name "$ssl_preread_server_name" '
                              'proxied to "$upstream_addr" '
                              '$protocol $status $bytes_sent $bytes_received '
                              '$session_time';
    access_log /var/log/nginx/stream-access.log stream_routing;
}
Example haproxy partial configuration:
(Notice how pkc-abcdef listens on ports 443 AND 9092. 443 is https and 9092 is kafka.)
frontend kafka
    bind :9092
    acl snib0 req.ssl_sni -i b0-pkc-abcdef.us-central1.gcp.confluent.cloud
    acl snib1 req.ssl_sni -i b1-pkc-abcdef.us-central1.gcp.confluent.cloud
    acl snib2 req.ssl_sni -i b2-pkc-abcdef.us-central1.gcp.confluent.cloud
    acl snib3 req.ssl_sni -i b3-pkc-abcdef.us-central1.gcp.confluent.cloud
    acl snib4 req.ssl_sni -i b4-pkc-abcdef.us-central1.gcp.confluent.cloud
    acl snib5 req.ssl_sni -i b5-pkc-abcdef.us-central1.gcp.confluent.cloud
    acl snibs req.ssl_sni -i pkc-abcdef.us-central1.gcp.confluent.cloud
    use_backend b0 if snib0
    use_backend b1 if snib1
    use_backend b2 if snib2
    use_backend b3 if snib3
    use_backend b4 if snib4
    use_backend b5 if snib5
    use_backend bs if snibs

frontend api
    bind :443
    acl snipkac req.ssl_sni -i pkac-a1b2c3.us-central1.gcp.confluent.cloud
    acl snirest req.ssl_sni -i pkc-abcdef.us-central1.gcp.confluent.cloud
    use_backend pkac if snipkac
    use_backend rest if snirest

backend b0
    mode tcp
    server b0 b0-pkc-abcdef.us-central1.gcp.confluent.cloud:9092 check

backend b1
    mode tcp
    server b1 b1-pkc-abcdef.us-central1.gcp.confluent.cloud:9092 check

backend b2
    mode tcp
    server b2 b2-pkc-abcdef.us-central1.gcp.confluent.cloud:9092 check

backend b3
    mode tcp
    server b3 b3-pkc-abcdef.us-central1.gcp.confluent.cloud:9092 check

backend b4
    mode tcp
    server b4 b4-pkc-abcdef.us-central1.gcp.confluent.cloud:9092 check

backend b5
    mode tcp
    server b5 b5-pkc-abcdef.us-central1.gcp.confluent.cloud:9092 check

backend bs
    mode tcp
    server bs pkc-abcdef.us-central1.gcp.confluent.cloud:9092 check

backend pkac
    mode tcp
    server bs pkac-a1b2c3.us-central1.gcp.confluent.cloud:443 check

backend rest
    mode tcp
    server bs pkc-abcdef.us-central1.gcp.confluent.cloud:443 check
Note: In some scenarios, HAProxy won't start if it cannot resolve endpoints on startup. You may encounter an error similar to: could not resolve address 'pkac-a1b2c3.us-central1.gcp.confluent.cloud'. Should this occur, here are a few items to check:
- Check whether the DNS record for the hostname in question has been deprecated (e.g. "nslookup pkac-a1b2c3.us-central1.gcp.confluent.cloud"). If the DNS record is no longer present, you can safely remove the associated entry from your proxy configs.
- Leverage the init-addr none HAProxy config, which may avoid the issue altogether. Please consider your use case, and leverage this config if applicable.
SSH Tunnel Guidance
Customers using an SSH tunnel to forward port 443 to the bastion host will incur downtime during this migration. This is because the same local port cannot be forwarded to two different hosts concurrently, unlike the proxy setup, which does not depend on port forwarding.
The semantics of this setup are below:
ssh -L local_port:destination_server_ip:remote_port ssh_server_hostname
Action required:
Customers would have previously set up an SSH tunnel based on the Kafka API endpoint format:
sudo ssh -L 443:<Kafka-API-Endpoint>:443 <username>@<instance-public-DNS>
When the UI changes to use the new endpoint format, requests will no longer go over this SSH tunnel. Customers can simply tear down and restart their SSH tunnel using the new endpoint format:
sudo ssh -L 443:<Kafka-REST-Endpoint>:443 <username>@<instance-public-DNS>
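Since only one tunnel can hold local port 443 at a time, the switch is a teardown of the old tunnel followed by a restart with the new endpoint. The sketch below only prints the command to run so that it can be reviewed first. The function name tunnel_cmd is hypothetical, and its arguments stand in for the placeholders used above.

```shell
# tunnel_cmd ENDPOINT USERNAME BASTION
# Print the ssh command that forwards local port 443 to ENDPOINT:443 through
# the bastion host. Nothing is executed; copy the output and run it after
# stopping the tunnel that points at the old endpoint.
tunnel_cmd() {
  printf 'sudo ssh -L 443:%s:443 %s@%s\n' "$1" "$2" "$3"
}
```

Calling tunnel_cmd with the Kafka REST endpoint, the SSH username, and the instance public DNS name yields the same command shape as the line above.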
Port/Firewall Troubleshooting
With appropriate configurations in place (proxy or SSH tunnel), traffic will use port 443 (HTTPS) or port 22 (SSH) on the proxy instance. These ports must be allowed in firewalls and NSGs for UI traffic to succeed.
DNS Troubleshooting
Appropriate DNS resolution is required for resolving Confluent endpoints to the appropriate IP address. From your local environment, this requires endpoint resolution to the proxy public IP address, allowing the proxy to forward traffic to Confluent over the private connection.
Your local DNS resolution may be updated via /etc/hosts updates, or corporate network DNS overrides. Below is an example /etc/hosts file:
##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
##
127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost
20.83.190.83 lkaclkc-qr98op.dom8w9on9wn.eastus.azure.confluent.cloud
20.83.190.83 lkc-qr98op.dom8w9on9wn.eastus.azure.confluent.cloud
1.2.3.4 pkac-a1b2c3.us-central1.gcp.confluent.cloud
1.2.3.5 pkc-abcdef.us-central1.gcp.confluent.cloud
1.2.3.6 b0-pkc-abcdef.us-central1.gcp.confluent.cloud
1.2.3.7 b1-pkc-abcdef.us-central1.gcp.confluent.cloud
1.2.3.8 b2-pkc-abcdef.us-central1.gcp.confluent.cloud
1.2.3.9 b3-pkc-abcdef.us-central1.gcp.confluent.cloud
1.2.3.10 b4-pkc-abcdef.us-central1.gcp.confluent.cloud
1.2.3.11 b5-pkc-abcdef.us-central1.gcp.confluent.cloud
Two entries have been added, resolving both Kafka API and Kafka REST endpoints to the public IP address of the proxy.
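Entries of this form can be generated rather than typed by hand. The following is a minimal sketch under the assumption that every listed FQDN should resolve to the same proxy IP address; the function name hosts_entries is hypothetical. It prints the lines to standard output and does not modify /etc/hosts itself.

```shell
# hosts_entries PROXY_IP FQDN...
# Print one /etc/hosts line per FQDN, each pointing at PROXY_IP.
hosts_entries() {
  ip="$1"
  shift
  for fqdn in "$@"; do
    printf '%s %s\n' "$ip" "$fqdn"
  done
}
```

For example, hosts_entries 20.83.190.83 lkaclkc-qr98op.dom8w9on9wn.eastus.azure.confluent.cloud lkc-qr98op.dom8w9on9wn.eastus.azure.confluent.cloud prints the Kafka API and Kafka REST lines shown in the example file; review the output before appending it to /etc/hosts.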