There are actually a few different ways to get to this message.
1. Blank state — no previous terraform applied. Start with a cloud block.
1. Implicit local — start with no backend specified. This actually goes
through the same code execution path as the first scenario.
1. Explicit local — start with a backend local block that has been
applied, then change from the local backend to a cloud block. This
will recognize the state, and is a different path through the code in
the meta backend.
This commit handles the last case. The messaging has also been tweaked.
End to end test included as well.
When a user runs `terraform refresh` we give them an error message that
tells them to run `terraform apply -refresh-state`. We could just run
that command for them, though. That is what this PR does.
* determining source or destination to cloud
* handling single to single state migrations to cloud,
using a name strategy or a tags strategy
* Add end-to-end tests for state migration.
With the alternative block introduced in 7bf9b2c7b, this removes the
ability to explicitly declare the 'cloud' backend. The literal backend
interface is an implementation detail and no longer a user-level
concept when using Terraform Cloud.
This is a replacement declaration for using Terraform Cloud as a remote
backend, leaving the literal backend as an implementation detail and not
a user-level concept.
When running `terraform init` against a backend with multiple
workspaces, none of which are the currently indicated local workspace,
Terraform prompts the user to choose a workspace from the list. In
automation, using the `-input=false` argument should disable asking for
input, but previously would hang instead.
When an explicit backend is configured with a configuration which has
not yet been initialized, running `terraform init` performs a state
migration to fetch the remotely stored state in order to operate on it.
Like the previous bug introduced by the recent provider diagnostics
change, this code path was not correctly configured to enable init mode
for the backend, which resulted in a fatal error during init when the
cache dir is deleted.
Setting the `Init` backend option allows this code path to continue
without error when first initializing the backend for state migration.
The new e2e test fails without this change.
When migrating state to an existing Terraform Cloud workspace using the
remote backend, we check the remote version is compatible with the local
one by default.
This commit fixes two bugs in this code:
- If using the "name" strategy for the remote backend, the list of
destination workspaces is empty. This resulted in no version checking
of the remote workspace, and we fell back to the string equality
check.
- The user-specified CLI flag `-ignore-remote-version` was not being
applied for the state migration version checking.
The init command needs to initialize a backend, in order to access
state, in turn to derive provider requirements from state. The backend
initialization step requires building provider factories, which
previously would fail if a lockfile was present without a corresponding
local provider cache.
This commit ensures that in this situation only, errors with the
provider factories are temporarily ignored. This allows us to continue
to initialize the backend, fetch providers, and then report any errors
as necessary.
We test that a deleted provider cache results in an error when running
terraform plan, but previously did not test that running init (as
instructed) would resolve the issue. This (failing) e2e test adds that
step.
We introduced this experiment to gather feedback, and the feedback we saw
led to us deciding to do another round of design work before we move
forward with something to meet this use-case.
In addition to being experimental, this has only been included in alpha
releases so far, and so on both counts it is not protected by the
Terraform v1.0 Compatibility Promises.
The -lock and -lock-timeout flags were removed prior to the release of
1.0 as they were thought to have no effect. This is not true in the case
of state migrations when changing backends. This commit restores these
flags, and adds test coverage for locking during backend state
migration.
Also update the help output describing other boolean flags, showing the
argument as the user would type it rather than the default behavior.
There is a race between the MockSource and ShutdownCh which sometimes
causes this test to fail. Add a HangingSource implementation of Source
which hangs until the context is cancelled, so that there is always time
for a user-initiated shutdown to trigger the cancellation code path
under test.
Previously we would reject attempts to delete a workspace if its state
contained any resources at all, even if none of the resources had any
resource instance objects associated with it.
Nowadays there isn't any situation where the normal Terraform workflow
will leave behind resource husks, and so this isn't as problematic as it
might've been in the v0.12 era, but nonetheless what we actually care
about for this check is whether there might be any remote objects that
this state is tracking, and for that it's more precise to look for
non-nil resource instance objects, rather than whole resources.
This also includes some adjustments to our error messaging to give more
information about the problem and to use terminology more consistent with
how we currently talk about this situation in our documentation and
elsewhere in the UI.
We were also using the old State.HasResources method as part of some of
our tests. I considered preserving it to avoid changing the behavior of
those tests, but the new check seemed close enough to the intent of those
tests that it wasn't worth maintaining this method that wouldn't be used
in any main code anymore. I've therefore updated those tests to use
the new HasResourceInstanceObjects method instead.
We have various mechanisms that aim to ensure that the installed provider
plugins are consistent with the lock file and that the lock file is
consistent with the provider requirements, and we do have existing unit
tests for them, but all of those cases mock our fake out at least part of
the process and in the past that's caused us to miss usability
regressions, where we still catch the error but do so at the wrong layer
and thus generate error message lacking useful additional context.
Here we'll add some new end-to-end tests to supplement the existing unit
tests, making sure things work as expected when we assemble the system
together as we would in a release. These tests cover a number of different
ways in which the plugin selections can grow inconsistent.
These new tests all run only when we're in a context where we're allowed
to access the network, because they exercise the real plugin installer
codepath. We could technically build this to use a local filesystem mirror
or other such override to avoid that, but the point here is to make sure
we see the expected behavior in the main case, and so it's worth the
small additional cost of downloading the null provider from the real
registry.
In the original incarnation of Meta.providerFactories we were returning
into a Meta.contextOpts whose signature didn't allow it to return an
error directly, and so we had compromised by making the provider factory
functions themselves return errors once called.
Subsequent work made Meta.contextOpts need to return an error anyway, but
at the time we neglected to update our handling of the providerFactories
result, having it still defer the error handling until we finally
instantiate a provider.
Although that did ultimately get the expected result anyway, the error
ended up being reported from deep in the guts of a Terraform Core graph
walk, in whichever concurrently-visited graph node happened to try to
instantiate the plugin first. This meant that the exact phrasing of the
error message would vary between runs and the reporting codepath didn't
have enough context to given an actionable suggestion on how to proceed.
In this commit we make Meta.contextOpts pass through directly any error
that Meta.providerFactories produces, and then make Meta.providerFactories
produce a special error type so that Meta.Backend can ultimately return
a user-friendly diagnostic message containing a specific suggestion to
run "terraform init", along with a short explanation of what a provider
plugin is.
The reliance here on an implied contract between two functions that are
not directly connected in the callstack is non-ideal, and so hopefully
we'll revisit this further in future work on the overall architecture of
the CLI layer. To try to make this robust in the meantime though, I wrote
it to use the errors.As function to potentially unwrap a wrapped version
of our special error type, in case one of the intervening layers is
changed at some point to wrap the downstream error before returning it.
In historical versions of Terraform the responsibility to check this was
inside the terraform.NewContext function, along with various other
assorted concerns that made that function particularly complicated.
More recently, we reduced the responsibility of the "terraform" package
only to instantiating particular named plugins, assuming that its caller
is responsible for selecting appropriate versions of any providers that
_are_ external. However, until this commit we were just assuming that
"terraform init" had correctly selected appropriate plugins and recorded
them in the lock file, and so nothing was dealing with the problem of
ensuring that there haven't been any changes to the lock file or config
since the most recent "terraform init" which would cause us to need to
re-evaluate those decisions.
Part of the game here is to slightly extend the role of the dependency
locks object to also carry information about a subset of provider
addresses whose lock entries we're intentionally disregarding as part of
the various little edge-case features we have for overridding providers:
dev_overrides, "unmanaged providers", and the testing overrides in our
own unit tests. This is an in-memory-only annotation, never included in
the serialized plan files on disk.
I had originally intended to create a new package to encapsulate all of
this plugin-selection logic, including both the version constraint
checking here and also the handling of the provider factory functions, but
as an interim step I've just made version constraint consistency checks
the responsibility of the backend/local package, which means that we'll
always catch problems as part of preparing for local operations, while
not imposing these additional checks on commands that _don't_ run local
operations, such as "terraform apply" when in remote operations mode.
Previously the planfile.Create function had accumulated probably already
too many positional arguments, and I'm intending to add another one in
a subsequent commit and so this is preparation to make the callsites more
readable (subjectively) and make it clearer how we can extend this
function's arguments to include further components in a plan file.
There's no difference in observable functionality here. This is just
passing the same set of arguments in a slightly different way.
Historically the responsibility for making sure that all of the available
providers are of suitable versions and match the appropriate checksums has
been split rather inexplicably over multiple different layers, with some
of the checks happening as late as creating a terraform.Context.
We're gradually iterating towards making that all be handled in one place,
but in this step we're just cleaning up some old remnants from the
main "terraform" package, which is now no longer responsible for any
version or checksum verification and instead just assumes it's been
provided with suitable factory functions by its caller.
We do still have a pre-check here to make sure that we at least have a
factory function for each plugin the configuration seems to depend on,
because if we don't do that up front then it ends up getting caught
instead deep inside the Terraform runtime, often inside a concurrent
graph walk and thus it's not deterministic which codepath will happen to
catch it on a particular run.
As of this commit, this actually does leave some holes in our checks: the
command package is using the dependency lock file to make sure we have
exactly the provider packages we expect (exact versions and checksums),
which is the most crucial part, but we don't yet have any spot where
we make sure that the lock file is consistent with the current
configuration, and we are no longer preserving the provider checksums as
part of a saved plan.
Both of those will come in subsequent commits. While it's unusual to have
a series of commits that briefly subtracts functionality and then adds
back in equivalent functionality later, the lock file checking is the only
part that's crucial for security reasons, with everything else mainly just
being to give better feedback when folks seem to be using Terraform
incorrectly. The other bits are therefore mostly cosmetic and okay to be
absent briefly as we work towards a better design that is clearer about
where that responsibility belongs.
We must ensure that the terraform required_version is checked as early
as possible, so that new configuration constructs don't cause init to
fail without indicating the version is incompatible.
The loadConfig call before the earlyconfig parsing seems to be unneeded,
and we can delay that to de-tangle it from installing the modules which
may have their own constraints.
TODO: it seems that loadConfig should be able to handle returning the
version constraints in the same manner as loadSingleModule.