Tips on Troubleshooting Cloud Application Manager issues
Check the proposed solutions to common issues before you contact Cloud Application Manager support in the event of a problem.
In this article
- Instance Unreachable During an Operation
- Instance is Stuck Deploying
- Instance is Still Terminating
- Instance Hangs at the Install Cloud Application Manager Agent Step
- Catalog Box deployment not working
- Resource sharing problems
- Instance fails to register in Cloud Application Manager
Instance Unreachable During an Operation
Cause
When you trigger a lifecycle operation on an instance, the instance goes into a processing state and never finishes. Possible causes are: the agent has no direct connectivity to the Cloud Application Manager backend server; the agent is configured to use a proxy and the proxy configuration is wrong or the proxy is not responding; or the agent is not running properly or has been stopped. This problem can affect instances launched through the Cloud Application Manager web interface, the API, or directly with the agent command.
Solution 1
Make sure the communication from CAM Agent to CAM servers works correctly.
- Connect to the instance by SSH or RDP.
- Run a command to check basic connectivity, such as the following or an equivalent one:
telnet cam.ctl.io 443
If the command gets no reply, take corrective action inside the VM and on any routers and firewalls to allow unrestricted communication between the agent and the CAM servers.
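Where telnet is not installed, a similar basic check can be sketched with bash's built-in /dev/tcp redirection. This is a minimal sketch, not an official CAM tool: the function name check_tcp and the 5-second default timeout are assumptions, and it relies on bash and the coreutils timeout command being present on the instance.

```shell
# Hypothetical helper (not part of the CAM agent): report whether a TCP
# connection to host:port can be opened within a few seconds.
check_tcp() {
  local host="$1" port="$2" wait_secs="${3:-5}"
  # /dev/tcp/HOST/PORT is a bash feature; the redirection fails if the
  # connection is refused, filtered, or the name does not resolve.
  if timeout "$wait_secs" bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "reachable: ${host}:${port}"
  else
    echo "unreachable: ${host}:${port}"
    return 1
  fi
}

# Example (same endpoint as the telnet command above):
#   check_tcp cam.ctl.io 443
```

A successful result only shows that the TCP port can be opened; it does not prove that a proxy or TLS inspection device in between passes the agent's traffic unchanged.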
If connectivity works as expected and the problem persists, connect to the instance over SSH or RDP and copy the agent log: /var/log/elasticbox/elasticbox-agent.log on Linux, or ProgramData\ElasticBox\elasticbox-agent.log on Windows (the folder is hidden in Windows, so you must type the exact path to open it). Attach the log file when you contact support.
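On a Linux instance, collecting the log can be scripted. The sketch below is illustrative only: the function name collect_agent_log and the archive naming scheme are assumptions; the default log path is the one given above.

```shell
# Hypothetical helper: bundle the agent log into a compressed archive that
# can be attached to a support request. Prints the archive path on success.
collect_agent_log() {
  local log_file="${1:-/var/log/elasticbox/elasticbox-agent.log}"
  local out_dir="${2:-.}"
  if [ ! -r "$log_file" ]; then
    echo "cannot read ${log_file} (try running with sudo)" >&2
    return 1
  fi
  local archive="${out_dir}/elasticbox-agent-log-$(uname -n)-$(date +%Y%m%d%H%M%S).tar.gz"
  # -C changes into the log directory so the archive holds only the file name.
  tar -czf "$archive" -C "$(dirname "$log_file")" "$(basename "$log_file")" || return 1
  echo "$archive"
}

# Example:
#   collect_agent_log                       # default path, archive in current directory
#   collect_agent_log /path/to/copy.log /tmp
```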
Solution 2
Check the agent proxy settings (if the agent is configured to use a proxy) at /usr/elasticbox/elasticbox.conf on Linux or ProgramFiles\ElasticBox\Agent\elasticbox.conf on Windows. If the agent uses a proxy, make sure it's working correctly and provides stable connectivity to the CAM servers.
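As a quick way to locate the proxy settings on Linux, the sketch below lists every line of the agent configuration file that mentions a proxy. The key names inside elasticbox.conf are not documented here, so the sketch makes no assumption about them and simply searches for the word "proxy"; the function name show_proxy_settings is hypothetical.

```shell
# Hypothetical helper: print the lines of the agent config that mention a proxy.
show_proxy_settings() {
  local conf="${1:-/usr/elasticbox/elasticbox.conf}"
  if [ ! -r "$conf" ]; then
    echo "cannot read ${conf}" >&2
    return 2
  fi
  # -i: case-insensitive, -n: prefix each match with its line number.
  if ! grep -in 'proxy' "$conf"; then
    echo "no proxy entries found in ${conf}"
  fi
}

# Example:
#   show_proxy_settings
```

If the output shows a proxy host and port, those values can then be tested for reachability from the instance in the same way as the CAM servers.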
Solution 3
Install the agent on the instance to bring it online. Then re-run the lifecycle operation.
- Connect to the instance by SSH or RDP.
- Install the agent. The command uses the token of the older agent to connect to the instance.
Linux instances deployed from the Cloud Application Manager cloud service:
curl -sSL https://cam.ctl.io | sudo bash
Windows instances deployed from the Cloud Application Manager cloud service (run the commands in PowerShell as an administrator):
[Net.ServicePointManager]::SecurityProtocol = [Net.ServicePointManager]::SecurityProtocol -bor [Net.SecurityProtocolType]::Tls12
(New-Object Net.WebClient).DownloadString("https://cam.ctl.io") | iex
If the problem persists, connect to the instance over SSH or RDP and copy the agent log: /var/log/elasticbox/elasticbox-agent.log on Linux, or ProgramData\ElasticBox\elasticbox-agent.log on Windows (the folder is hidden in Windows, so you must type the exact path to open it). Attach the log file when you contact support.
Instance is Stuck Deploying
Cause 1
An instance can’t update its state because the Cloud Application Manager agent can’t connect to Cloud Application Manager. Poor agent network connectivity can cause this issue.
Solution 1
- Restart the agent on the Linux or Windows instance:
Linux:
sudo /usr/elasticbox/elasticbox restart
Windows (run in a command prompt):
net stop elasticbox
net start elasticbox
- In Cloud Application Manager, open the lifecycle editor of the instance and click Reinstall. The instance should reflect the proper status.
- If the problem persists, check connectivity between the instance and the CAM servers.
If connectivity works as expected and the problem persists, connect to the instance over SSH or RDP and copy the agent log: /var/log/elasticbox/elasticbox-agent.log on Linux, or ProgramData\ElasticBox\elasticbox-agent.log on Windows (the folder is hidden in Windows, so you must type the exact path to open it). Attach the log file when you contact support.
Solution 2
Check the agent proxy settings (if the agent is configured to use a proxy) at /usr/elasticbox/elasticbox.conf on Linux or ProgramFiles\ElasticBox\Agent\elasticbox.conf on Windows. If the agent uses a proxy, make sure it's working correctly and provides stable connectivity to the CAM servers.
Solution 3
Make sure the communication from CAM Agent to CAM servers works correctly.
- Connect to the instance by SSH or RDP.
- Run a command to check basic connectivity, such as the following or an equivalent one:
telnet cam.ctl.io 443
- Take corrective action inside the VM and on any routers and firewalls to allow unrestricted communication between the agent and the CAM servers.
- In Cloud Application Manager, open the lifecycle editor of the instance and click Reinstall. The instance should reflect the proper status.
If connectivity works as expected and the problem persists, connect to the instance over SSH or RDP and copy the agent log: /var/log/elasticbox/elasticbox-agent.log on Linux, or ProgramData\ElasticBox\elasticbox-agent.log on Windows (the folder is hidden in Windows, so you must type the exact path to open it). Attach the log file when you contact support.
Cause 2
When a box script terminates with exit code 100, typically because of apt-get failures, the Cloud Application Manager agent goes into sleep mode and waits for the machine or service to reboot.
Solution
- Open the instance page in CAM and review the deployment logs.
- Identify which script is exiting with code 100.
- Open the lifecycle editor.
- Edit the script you identified and add the following line immediately after the command that exited with code 100:
[[ "$?" -eq "100" ]] && exit 1
- Connect to the instance over SSH or RDP.
- Restart the agent.
Linux:
sudo /usr/elasticbox/elasticbox restart
Windows (run in a command prompt):
net stop elasticbox
net start elasticbox
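To see how the guard line from the steps above behaves, the sketch below runs it in a subshell right after a stand-in command. fake_apt_get is a hypothetical placeholder for the failing command, not a real tool; in an actual box script the guard goes directly after the real command that returned 100.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a command that fails the way apt-get can (exit 100).
fake_apt_get() { return 100; }

# Runs the stand-in followed by the guard line. The function body is a
# subshell, so "exit 1" ends only the subshell and not the calling shell.
run_guarded_step() (
  fake_apt_get
  [[ "$?" -eq "100" ]] && exit 1
  echo "step finished"
)

# The guard turns exit code 100 into exit code 1.
run_guarded_step || echo "guarded step exit status: $?"
```

The guard must come immediately after the command it checks, because `$?` only holds the status of the most recently executed command.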
Instance is Still Terminating
Cause
Several concurrent terminate requests within seconds of each other can cause a race condition and keep the instance in running state.
Solution
From the instance page in CAM, force the instance to terminate. Alternatively, open the lifecycle editor of the instance and click Force Terminate.
Instance Hangs at the Install Cloud Application Manager Agent Step
Cause
The agent process is running on the instance but has stopped responding.
Solution
- Log in to the instance and kill the agent.
- Redeploy the instance from the lifecycle editor in Cloud Application Manager. The agent should start deploying.
If the problem persists, connect to the instance over SSH or RDP and copy the agent log: /var/log/elasticbox/elasticbox-agent.log on Linux, or ProgramData\ElasticBox\elasticbox-agent.log on Windows (the folder is hidden in Windows, so you must type the exact path to open it). Attach the log file when you contact support.
Catalog Box deployment not working
Cause 1
The box requirements are not met by the deployment policy box.
Solution
Check the box readme documentation and the box requirements, and make sure your provider and deployment policy box meet the minimum requirements (OS type and version, CPU, RAM, and so on).
Cause 2
The public Catalog boxes are provided as templates for quickly deploying commonly used software packages. We do our best to keep them up to date, but from time to time software updates, deprecated packages, modules, or libraries, or simply broken installation scripts might break them or render them unusable.
Solution
We encourage you to contact support to alert us to the broken box. In the meantime, you can define the install in a box using bash commands such as apt-get, wget, curl, and more.
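As a rough illustration of defining an install with bash commands, the sketch below wraps a command in a simple retry loop, which can help when a package mirror is briefly unavailable. The retry helper, its attempt count and delay, and the example package name are assumptions for illustration, not part of any catalog box.

```shell
# Hypothetical helper: run a command up to N times, pausing between attempts.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "failed after ${n} attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example use in an install script (nginx is only a placeholder package):
#   retry 3 5 sudo apt-get update
#   retry 3 5 sudo apt-get install -y nginx
```

Because the helper returns 1 once the attempts are used up, a script that stops on that failure ends with exit code 1 rather than passing through whatever code the wrapped command produced.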
Resource sharing problems
When you try to share a resource (provider, box, or instance), the desired user or team can't be found in the users list, or the ownership of the resource can't be transferred to another user or team.
Cause 1
Resources can only be shared with other users or teams that belong to the same organization or have their organizations federated.
Solution
Make sure the resource owner and the user you want to share with belong to the same organization, or ask the organization administrator to request federation with the other organization through Support.
Cause 2
Providers and deployment policy boxes can only be shared with other users or teams that belong to the same cost center. Each user or team has an assigned cost center that cannot be changed.
Solution 1
Make sure the resource owner and the user you want to share with belong to the same cost center, or ask the organization administrator to request federation with the other organization through Support.
Solution 2
An organization administrator can edit the cost center and add any user in the organization (or in federated organizations) as a cost center administrator. Once the user is a cost center administrator, any resource owned by any other user or team in the same cost center can be shared with this user.
Cause 3
Ownership of providers and deployment policy boxes can only be transferred to other users or teams that belong to the same cost center. Each user or team has an assigned cost center that cannot be changed.
Solution
Make sure the resource owner and the user you want to transfer ownership to belong to the same cost center.
Instance fails to register in Cloud Application Manager
Cause
An instance registration (import) is waiting for the CAM agent to install, but the agent installation never completes.
Solution 1
Make sure that communication from the instance being registered to the CAM servers works correctly.
- Connect to the instance by SSH or RDP.
- Run a command to check basic connectivity, such as the following or an equivalent one:
telnet cam.ctl.io 443
If the command gets no reply:
- Take corrective action inside the VM and on any routers and firewalls to allow unrestricted communication between the instance and the CAM servers.
- In Cloud Application Manager, open the instance details page and click Retry import if available, or click Cancel import and start the import process again. The instance should now finish the registration process successfully.
If connectivity works as expected and the problem persists, connect to the instance over SSH or RDP and copy the agent log: /var/log/elasticbox/elasticbox-agent.log on Linux, or ProgramData\ElasticBox\elasticbox-agent.log on Windows (the folder is hidden in Windows, so you must type the exact path to open it). Attach the log file when you contact support.
Solution 2
The CAM agent might be stopped and unable to complete the registration process because the SSL stack on the instance does not support TLS 1.2 or later. This can happen on older Windows machines that use a .NET Framework version earlier than 4.5 (in version 4.5 you have to opt in to use TLS 1.2, while in versions 4.6 and above it is the default).
Check whether the ElasticBox bootstrap logs (C:\program files\elastic Box in Windows) contain a message such as:
2019-09-05 17:45:07Z Exception setting "SecurityProtocol": "Cannot convert null to type "System.Net.SecurityProtocolType" due to invalid enumeration values. Specify one of the following enumeration values and try again. The possible enumeration values are "Ssl3, Tls"."
This message confirms that the environment executing the script does not support the required TLS 1.2 protocol. Upgrade the environment to a version that supports TLS 1.2 to address this issue (for example, older Windows environments might require upgrading the .NET Framework version that the script uses).
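If the bootstrap log has been copied to a machine with a POSIX shell (or is opened from a tool such as Git Bash on the instance), the message can be searched for rather than spotted by eye. The function name and the log file argument below are assumptions; the search string is taken from the sample message above.

```shell
# Hypothetical helper: succeed if the given log contains the TLS-related
# SecurityProtocol exception quoted above, fail otherwise.
has_security_protocol_error() {
  local log_file="$1"
  # -F: treat the pattern as a fixed string, -q: no output, exit status only.
  grep -Fq 'Exception setting "SecurityProtocol"' "$log_file"
}

# Example (bootstrap.log is a placeholder for the copied log file):
#   if has_security_protocol_error bootstrap.log; then
#     echo "TLS 1.2 is not available to the bootstrap script"
#   fi
```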
If TLS 1.2 support is available on the instance and the problem persists, connect to the instance over SSH or RDP and copy the agent log: /var/log/elasticbox/elasticbox-agent.log on Linux, or ProgramData\ElasticBox\elasticbox-agent.log on Windows (the folder is hidden in Windows, so you must type the exact path to open it). Attach the log file when you contact support.