Previously there were only two planning modes: normal mode and destroy
mode. In that context it made sense for these to be distinguished only by
a boolean flag.
We're now getting ready to add our third mode, "refresh only". This
establishes the idea that planning can be done in one of a number of
mutually-exclusive "modes", which are related to but separate from the
various other options that serve as modifiers for the plan operation.
This commit only introduces the new plans.Mode type and replaces the
existing "destroy" flag with a variable of that type. This doesn't cause
any change in effective behavior because Terraform Core still supports
only NormalMode and DestroyMode, with NewContext rejecting an attempt to
create a RefreshMode context for now.
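For illustration, a minimal sketch of the new type, assuming an
enum-style declaration in the plans package; the constant values and
the name of the third mode's constant are illustrative:

```go
// Mode represents the mutually-exclusive planning modes.
type Mode rune

const (
	// NormalMode is the default planning mode.
	NormalMode Mode = 0

	// DestroyMode replaces the old boolean "destroy" flag.
	DestroyMode Mode = 'D'

	// RefreshMode is the forthcoming "refresh only" mode, which
	// NewContext still rejects for now.
	RefreshMode Mode = 'R'
)
```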
It is in retrospect a little odd that the "destroy" flag was part of
ContextOpts rather than just an argument to the Plan method, but
refactoring that would be too invasive a change right now, so we'll
leave it as a field of the context and save revisiting it for another
day.
To ensure that the apply command can determine whether an operation is
executed locally or remotely, we add an IsLocalOperations method on the
remote backend. This returns the internal forceLocal boolean.
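A sketch of the accessor, assuming the remote backend's existing
Remote receiver type and forceLocal field:

```go
// IsLocalOperations reports whether operations for this backend run
// locally rather than on the remote system.
func (b *Remote) IsLocalOperations() bool {
	return b.forceLocal
}
```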
We also update this flag after checking whether the corresponding
remote workspace is in local operations mode. This ensures that we know
if an operation is running locally (entirely on the practitioner's
machine), pseudo-locally (on a Terraform Cloud worker), or remotely
(executing on a worker, rendering locally).
When migrating state to a new workspace, the version check would fail
with a 404 error when fetching the workspace record, resulting in a
failed state migration.
Instead we should look specifically for a 404 error, and allow migration
to continue. If we're just about to create the workspace, there can't be
a version incompatibility problem.
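A sketch of the shape of the check, assuming go-tfe's sentinel error
for 404 responses; the surrounding control flow is illustrative:

```go
w, err := b.client.Workspaces.Read(context.Background(), b.organization, name)
if err == tfe.ErrResourceNotFound {
	// The workspace doesn't exist yet, so there can be no version
	// incompatibility; migration will create it shortly.
	return nil
}
if err != nil {
	return fmt.Errorf("error looking up workspace: %s", err)
}
// ... continue the version check against w.TerraformVersion ...
```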
When using the -lock-timeout option with the remote backend configured
in local operations mode, Terraform would fail to retry acquiring the
lock. This was caused by the lock error message having a missing Info
field, which the state manager requires to be present in order to
attempt retries.
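A sketch of the corrected error value, assuming statemgr's LockError
type; the key point is that Info must be non-nil for the retry loop to
engage:

```go
return "", &statemgr.LockError{
	Info: &statemgr.LockInfo{
		Operation: "plan",           // illustrative values
		Who:       "remote backend",
	},
	Err: errors.New("workspace already locked"),
}
```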
This commit extracts the remaining UI logic from the local backend,
and removes access to the direct CLI output. This is replaced with an
instance of a `views.Operation` interface, which codifies the current
requirements for the local backend to interact with the user.
The exception to this at present is interactivity: approving a plan
still depends on the `UIIn` field for the backend. This is out of scope
for this commit and can be revisited separately, at which time the
`UIOut` field can also be removed.
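A sketch of the interface's rough shape; the method set shown here is
inferred from the description above rather than copied from the views
package:

```go
type Operation interface {
	// Diagnostics renders warnings and errors for the user.
	Diagnostics(diags tfdiags.Diagnostics)

	// Interrupted and FatalInterrupt report operation cancellation.
	Interrupted()
	FatalInterrupt()

	// EmergencyDumpState is the state backup handler of last resort.
	EmergencyDumpState(stateFile *statefile.File) error

	// ... plan and apply rendering methods elided ...
}
```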
Changes in support of this:
- Some instances of direct error output have been replaced with
diagnostics, most notably in the emergency state backup handler. This
requires reformatting the error messages to allow the diagnostic
renderer to line-wrap them;
- The "in-automation" logic has moved out of the backend and into the
view implementation;
- The plan, apply, refresh, and import commands instantiate a view and
set it on the `backend.Operation` struct, as these are the only code
paths which call the `local.Operation()` method that requires it;
- The show command requires the plan rendering code which is now in the
views package, so there is a stub implementation of a `views.Show`
interface there.
Other refactoring work in support of migrating these commands to the
common views code structure will come in follow-up PRs, at which point
we will be able to remove the UI instances from the unit tests for those
commands.
The clistate package includes a Locker interface which provides a simple
way for the local backend to lock and unlock state, while providing
feedback to the user if there is a delay while waiting for the lock.
Prior to this commit, the backend was responsible for initializing the
Locker, passing through direct access to the cli.Ui instance.
This structure prevented commands from providing their own
implementations of the state locker UI. In this commit, we:
- Move the responsibility of creating the appropriate Locker to the
source of the Operation;
- Add the ability to set the context for a Locker via a WithContext
method;
- Replace the Locker's cli.Ui and Colorize members with a StateLocker
view;
- Implement views.StateLocker for human-readable UI;
- Update the Locker interface to return detailed diagnostics instead of
errors, reducing its direct interactions with UI;
- Add a Timeout() method on Locker to allow the remote backend to
continue to misuse the -lock-timeout flag to cancel pending runs.
When an Operation is created, the StateLocker field must now be
populated with an implementation of Locker. For situations where locking
is disabled, this can be a no-op locker.
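A sketch of the reworked interface, with signatures implied by the
list above; exact details live in the clistate package:

```go
type Locker interface {
	// Lock the state, returning diagnostics rather than a bare error.
	Lock() tfdiags.Diagnostics

	// Unlock the state, also returning diagnostics.
	Unlock() tfdiags.Diagnostics

	// Timeout returns the configured lock timeout, for the remote
	// backend's -lock-timeout handling described above.
	Timeout() time.Duration

	// WithContext returns a Locker that uses the given context for
	// cancellation while waiting to acquire the lock.
	WithContext(ctx context.Context) Locker
}
```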
This change has no significant effect on the operation of Terraform,
with the exception of slightly different formatting of errors when state
locking or unlocking fails.
The enhanced backends (local and remote) need to be able to render
diagnostics during operations. Prior to this commit, this functionality
was supported with a per-backend `ShowDiagnostics` function pointer.
In order to allow users of these backends to control how diagnostics are
rendered, this commit moves that function pointer to the `Operation`
type. This means that a diagnostic renderer is configured for each
operation, rather than once per backend initialization.
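A sketch of the relocated callback; the variadic signature mirrors how
the commands' showDiagnostics method is described below:

```go
type Operation struct {
	// ShowDiagnostics renders diagnostics for the user. Each command
	// supplies its own renderer when constructing the operation.
	ShowDiagnostics func(vals ...interface{})

	// ... other fields elided ...
}
```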
Some secondary consequences of this change:
- The `ReportResult` method on the backend is now moved to the
`Operation` type, as it needs to access the `ShowDiagnostics` callback
(and nothing else from the backend);
- Tests which assumed that diagnostics would be written to the backend's
`cli.Ui` instance are migrated to using a new record/playback diags
helper function;
- Apply, plan, and refresh commands now pass a pointer to the `Meta`
struct's `showDiagnostics` method.
This commit should not change how Terraform works, and is refactoring in
preparation for more changes which move UI code out of the backend.
This dramatically simplifies the logic around auto-approve, which is
nice.
Also add test coverage for the manual approve step, for both apply and
destroy, answering both yes and no.
CountHook is an implementation of terraform.Hook which is used to
calculate how many resources were added, changed, or destroyed during an
apply. This hook was previously injected in the local backend code,
which meant that the apply command code had no access to these counts.
This commit moves the CountHook code into the command package, and
removes an unused instance of the hook in the plan code path. The goal
here is moving UI code into the command package.
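A sketch of the hook's shape; the field names are illustrative, and
the real type also tracks in-flight resources internally in order to
classify each completed apply:

```go
type CountHook struct {
	terraform.NilHook // no-op defaults for the rest of the Hook interface

	Added   int
	Changed int
	Removed int
}
```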
If the remote backend is connected to a Terraform Cloud workspace in
local operations mode, we disable the version check, as the remote
Terraform version is meaningless.
Terraform remote version conflicts are not a concern for operations. We
are in one of three states:
- Running remotely, in which case the local version is irrelevant;
- Workspace configured for local operations, in which case the remote
version is meaningless;
- Forcing local operations with a remote backend, which should only
happen in the Terraform Cloud worker, in which case the Terraform
versions by definition match.
This commit therefore disables the version check for operations (plan
and apply), which has the consequence of disabling it in Terraform Cloud
and Enterprise runs. In turn this enables Terraform Enterprise runs with
bundles which have a version that doesn't exactly match the bundled
Terraform version.
Terraform Cloud/Enterprise support a pseudo-version of "latest" for the
configured workspace Terraform version. If this is chosen, we abandon
the attempt to verify the versions are compatible, as the meaning of
"latest" cannot be predicted.
This affects both the StateMgr check (used for commands which execute
remotely) and the full version check (for local commands).
When using the enhanced remote backend, a subset of all Terraform
operations are supported. Of these, only plan and apply can be executed
on the remote infrastructure (e.g. Terraform Cloud). Other operations
run locally and use the remote backend for state storage.
This causes problems when the local version of Terraform does not match
the configured version from the remote workspace. If the two versions
are incompatible, an `import` or `state mv` operation can cause the
remote workspace to be unusable until a manual fix is applied.
To prevent this from happening accidentally, this commit introduces a
check that the local Terraform version and the configured remote
workspace Terraform version are compatible. This check is skipped for
commands which do not write state, and can also be disabled by the use
of a new command-line flag, `-ignore-remote-version`.
Terraform version compatibility is defined as:
- For all releases before 0.14.0, local must exactly equal remote, as
two different versions cannot share state;
- 0.14.0 to 1.0.x are compatible, as we will not change the state
version number until at least Terraform 1.1.0;
- Versions after 1.1.0 must have the same major and minor versions, as
we will not change the state version number in a patch release.
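A sketch of these rules as a predicate, using hashicorp/go-version;
the function name and the exact boundary handling for mixed
pre/post-1.1.0 pairs are illustrative:

```go
import version "github.com/hashicorp/go-version"

var (
	v0_14_0 = version.Must(version.NewVersion("0.14.0"))
	v1_1_0  = version.Must(version.NewVersion("1.1.0"))
)

func remoteVersionCompatible(local, remote *version.Version) bool {
	// Before 0.14.0, two different versions cannot share state.
	if local.LessThan(v0_14_0) || remote.LessThan(v0_14_0) {
		return local.Equal(remote)
	}
	// Everything from 0.14.0 up to 1.1.0 shares a state version.
	if local.LessThan(v1_1_0) && remote.LessThan(v1_1_0) {
		return true
	}
	// From 1.1.0 onward the state version may only change in a minor
	// release, so major and minor must match.
	ls, rs := local.Segments(), remote.Segments()
	return ls[0] == rs[0] && ls[1] == rs[1]
}
```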
If the two versions are incompatible, a diagnostic is displayed,
advising that the error can be suppressed with `-ignore-remote-version`.
When this flag is used, the diagnostic is still displayed, but as a
warning instead of an error.
Commands which will not write state can assert this fact by calling the
helper `meta.ignoreRemoteBackendVersionConflict`, which will disable the
checks. Those which can write state should instead call the helper
`meta.remoteBackendVersionCheck`, which will return diagnostics for
display.
In addition to these explicit paths for managing the version check, we
have an implicit check in the remote backend's state manager
initialization method. Both of the above helpers will disable this
check. This fallback is in place to ensure that future code paths which
access state cannot accidentally skip the remote version check.
The remote backend tests spent most of their execution time sleeping
in various polling and backoff waits. This is unnecessary when testing
against a mock server, so we reduce all of these delays to much lower
values under test.
Only one remaining test has an artificial delay: verifying the discovery
of services against an unknown hostname. This times out at DNS
resolution, which is more difficult to fix than seems worth it at this
time.
Use a single log writer instance for all std library logging.
Set up the std log writer in the logging package, and remove
boilerplate from test packages.
A cost estimation error does not actually stop a run, so the run would
continue in the background after the CLI exited, causing confusion.
This change matches the UI behavior.
Most of the state package has been deprecated by the states package.
This PR replaces all the references to the old state package that
can be replaced simply - the low-hanging fruit.
* states: move state.Locker to statemgr
The state.Locker interface was a wrapper around a statemgr.Full, so
moving this was relatively straightforward.
* command: remove unnecessary use of state package for writing local terraform state files
* move state.LocalState into terraform package
state.LocalState is responsible for managing terraform.States, so it
made sense (to me) to move it into the terraform package.
* slight change of heart: move state.LocalState into clistate instead of
terraform
* unlock the state if Context() has an error, exactly as backend/remote does today
* terraform console and terraform import will exit before unlocking state in case of error in Context()
* responsibility for unlocking state in the local backend is pushed down the stack, out of backend.go and into each individual state operation
* add tests confirming that state is not locked after apply and plan
* backend/local: add checks that the state is unlocked after operations
This adds tests to plan, apply and refresh which validate that the state
is unlocked after all operations, regardless of exit status. I've also
added specific tests that force Context() to fail during each operation,
to verify the unlocking behavior on that error path specifically.
* command/console: return in case of errors before trying to unlock remote
state
The remote backend `Context` would exit without an active lock if there
was an error, while the local backend `Context` exited *with* a lock. This
caused a problem in `terraform console`, which would call unlock
regardless of error status.
This commit makes the local and remote backends consistently unlock the
state in case of error, and updates terraform console to check for
errors before trying to unlock the state.
* adding tests for remote and local backends
* backend/remote: do not panic if PrepareConfig or Configure receive null
objects
If a user cancels (ctrl-c) terraform init while it is requesting missing
configuration options for the remote backend, the PrepareConfig and
Configure functions would receive a null cty.Value which would result in
panics. This PR adds a check for null objects to the two functions in
question.
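A sketch of the guard; PrepareConfig is shown, and Configure receives
the same treatment:

```go
func (b *Remote) PrepareConfig(obj cty.Value) (cty.Value, tfdiags.Diagnostics) {
	var diags tfdiags.Diagnostics
	if obj.IsNull() {
		// An interrupted init can hand us a null object; return it
		// as-is rather than panicking on attribute lookups below.
		return obj, diags
	}
	// ... existing attribute validation continues here ...
	return obj, diags
}
```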
Fixes #23992
The remote server might choose to skip running cost estimation for a
targeted plan, in which case we'll show a note about it in the UI and then
move on, rather than returning an "invalid status" error.
This new status isn't yet available in the go-tfe library as a constant,
so for now we have the string directly in our switch statement. This is
a pragmatic way to expedite getting the "critical path" of this feature
in place without blocking on changes to ancillary codebases. A subsequent
commit should switch this over to tfe.CostEstimateSkippedDueToTargeting
once that's available in a go-tfe release.
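A sketch of the switch arm; the string literal is our assumption about
the status value, pending a proper constant in go-tfe:

```go
switch ce.Status {
case tfe.CostEstimateFinished:
	// ... render the cost estimate as before ...
case "skipped_due_to_targeting":
	b.CLI.Output("Cost estimation was skipped because the plan was created with -target.")
default:
	return fmt.Errorf("unknown or unexpected cost estimate state: %s", ce.Status)
}
```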
Previously we did not allow -target to be used with the remote backend
because there was no way to send the targets to Terraform Cloud/Enterprise
via the API.
There is now an attribute in the request for creating a plan that allows
us to send target addresses, so we'll remove that restriction and copy
the given target addresses into the API request.
This includes a new TargetAddrs field on both Run and RunCreateOptions
which we'll use to send resource addresses that were specified using
-target on the CLI command line when using the remote backend.
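A sketch of copying the CLI targets into the run creation request; the
field names on the backend's operation value are illustrative:

```go
options := tfe.RunCreateOptions{
	ConfigurationVersion: cv,
	Workspace:            w,
}
for _, addr := range op.Targets {
	options.TargetAddrs = append(options.TargetAddrs, addr.String())
}
run, err := b.client.Runs.Create(ctx, options)
```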
There were some unrelated upstream breaking changes compared to the last
version we had vendored, so this commit also includes some changes to the
backend/remote package to work with this new API, which now requires the
remote backend to be aware of the remote system's opaque workspace id.
Protections for both differing serials and differing lineage should be
bypassed with the -force flag (in addition to the resource changes
check).
Compared to other backends, we aren’t just shipping the state bytes
over in a simple payload during the persistence phase of the push
command; the force flag added to the Go TFE client needs to be
specified at that time.
To avoid changing every method signature of the remote client's
PersistState, I added an optional interface that provides a hook to
flag the Client as operating in a force-push context. Changing the
method signature would be more explicit, at the cost of adding a
parameter that is not currently used anywhere else; alternatively, the
optional interface pattern could be applied to the state itself so it
could be upgraded to support PersistState(force bool) only when needed.
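A sketch of the optional interface; ClientForcePusher matches the type
name referenced by the tests mentioned below:

```go
type ClientForcePusher interface {
	Client

	// EnableForcePush flags the client so that subsequent PersistState
	// calls perform a force push.
	EnableForcePush()
}
```

Callers can then type-assert their Client before persisting, enabling
force push only when the -force flag was given.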
Prior to this, only the resources of the state were checked for
changes, not the lineage or the serial. To bring this in line with the
documented behavior noted above, those attributes now also have a
“read” counterpart just as the state has. These are now checked along
with the state to determine whether the state as a whole is unchanged.
Tests were converted to a table-driven format, and testing was expanded
to include WriteStateForMigration and its interaction with the
ClientForcePusher type.
* backend/remote: Filter environment variables when loading context
Following up on #23122, the remote system (Terraform Cloud or
Enterprise) serves environment and Terraform variables using a single
type of object. We only should load Terraform variables into the
Terraform context.
Fixes https://github.com/hashicorp/terraform/issues/23283.
For remote operations, the remote system (Terraform Cloud or Enterprise)
writes the stored variable values into a .tfvars file before running the
remote copy of Terraform CLI.
By contrast, for operations that only run locally (like
"terraform import"), we fetch the stored variable values from the remote
API and add them into the set of available variables directly as part
of creating the local execution context.
Previously in the local-only case we were assuming that all stored
variables are strings, which isn't true: the Terraform Cloud/Enterprise UI
allows users to specify that a particular variable is given as an HCL
expression, in which case the correct behavior is to parse and evaluate
the expression to obtain the final value.
This also addresses a related issue whereby previously we were forcing
all sensitive values to be represented as a special string "<sensitive>".
That leads to type checking errors for any variable specified as having
a type other than string, so instead here we use an unknown value as a
placeholder so that type checking can pass.
Unpopulated sensitive values may cause errors downstream though, so we'll
also produce a warning for each of them to let the user know that those
variables are not available for local-only operations. It's a warning
rather than an error so that operations that don't rely on known values
for those variables can potentially complete successfully.
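A sketch of the decoding rules described above, assuming go-tfe's
Variable fields (Key, Value, HCL, Sensitive); the helper name is
illustrative:

```go
func remoteVariableValue(v *tfe.Variable) (cty.Value, tfdiags.Diagnostics) {
	var diags tfdiags.Diagnostics

	if v.Sensitive {
		// The API redacts sensitive values, so stand in an unknown
		// value that still passes type checking; a warning is emitted
		// for each such variable elsewhere.
		return cty.DynamicVal, diags
	}

	if v.HCL {
		// The stored value is an HCL expression: parse and evaluate
		// it to obtain the final value.
		expr, hclDiags := hclsyntax.ParseExpression(
			[]byte(v.Value), v.Key, hcl.Pos{Line: 1, Column: 1},
		)
		diags = diags.Append(hclDiags)
		if hclDiags.HasErrors() {
			return cty.DynamicVal, diags
		}
		val, valDiags := expr.Value(nil)
		diags = diags.Append(valDiags)
		return val, diags
	}

	// Otherwise the stored value is a plain string.
	return cty.StringVal(v.Value), diags
}
```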
This can potentially produce errors in situations that would've been
silently ignored before: if a remote variable is marked as being HCL
syntax but is not valid HCL then it will now fail parsing at this early
stage, whereas previously it would've just passed through as a string
and failed only if the operation tried to interpret it as a non-string.
However, in situations like these the remote operations like
"terraform plan" would already have been failing with an equivalent
error message anyway, so it's unlikely that any existing workspace that
is being used for routine operations would have such a broken
configuration.