There is no reason to retry around the execution of remote scripts. We've
already established a connection, so the only thing that could happen here
is to continually retry uploading or executing a script that can't succeed.
This also simplifies the streaming output from the command, which
doesn't need such explicit synchronization. Closing the output pipes is
sufficient to stop the copyOutput functions, and they don't close around
any values that are accessed again after the command executes.
Every provisioner that uses communicator implements its own retryFunc.
Take the remote-exec implementation (since it's the most complete) and
put it in the communicator package for each provisioner to use.
Add a public interface `communicator.Fatal`, which can wrap an error to
indicate a fatal error that should not be retried.
Add `host_key` and `bastion_host_key` fields to the ssh communicator
config for strict host key checking.
Both fields expect the contents of an OpenSSH-formatted public key. This
key can either be the remote host's public key, or the public key of the
CA which signed the remote host certificate.
Support for signed certificates is limited, because the provisioner
usually connects to a remote host by IP address rather than hostname, so
the certificate would need to be signed appropriately. Connecting via
a hostname currently needs to be done through a secondary provisioner,
like one attached to a null_resource.
Currently the provisioner will fail if the `hab` user already exists on
the target system.
This adds a check to see if we need to create the user before trying to
add it.
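
One way such a check could be expressed is sketched below. The helper `createUserCommand` and the exact shell commands (`id -u`, `useradd`) are assumptions for illustration, not the commands the provisioner necessarily runs.

```go
package main

import "fmt"

// createUserCommand is a hypothetical helper that returns a shell command
// which adds the user only when `id -u` reports that it does not exist.
func createUserCommand(user string, useSudo bool) string {
	check := fmt.Sprintf("id -u %s >/dev/null 2>&1", user)
	add := fmt.Sprintf("useradd %s", user)
	if useSudo {
		add = "sudo " + add
	}
	// The right-hand side runs only if the existence check fails.
	return fmt.Sprintf("%s || %s", check, add)
}
```
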
Fixes #17159
Signed-off-by: Nolan Davidson <ndavidson@chef.io>
This change allows the Habitat supervisor service name to be
configurable. Currently it is hard coded to `hab-supervisor`.
Signed-off-by: Nolan Davidson <ndavidson@chef.io>
Moves the nested select statements for backend operations into a single
function. The only difference in this part was that apply called
PersistState, which should be harmless regardless of the type of
operation being run.
The error was being silently dropped before.
There is an interpolation error, because the plan is canceled before
some of the resources can be evaluated. There might be a better way to
handle this in the walk cancellation, but the behavior has not changed.
Make the plan and apply shutdown match implementation-wise
If the user wishes to interrupt the running operation, only the first
interrupt was communicated to the operation by canceling the provided
context. A second interrupt would start the shutdown process, but not
communicate this to the running operation. This order of events could
cause partial writes of state.
What would happen is that once the command returns, the plugin system
would stop the provider processes. Once the provider processes die, all
pending Eval operations would return with an error, and quickly
cause the operation to complete. Since the backend code didn't know that
the process was shutting down imminently, it would continue by
attempting to write out the last known state. Under the right
conditions, the process would exit part way through the writing of the
state file.
Add Stop and Cancel CancelFuncs to the RunningOperation, to allow it to
easily differentiate between the two signals. The backend will then be
able to detect a shutdown and abort more gracefully.
In order to ensure that the backend is not in the process of writing the
state out, the command will always attempt to wait for the process to
complete after cancellation.
The plan shutdown test often fails on slow CI hosts, because the plan
completes before the main thread can cancel it. Since attempting to make
the MockProvider concurrent proved too invasive for now, just slow the
test down a bit to help ensure Stop gets called.