The previous conservative guarantee that we would not make backwards
incompatible changes to the state file format until at least Terraform
1.1 can now be extended. Terraform 0.14 through 1.1 will be able to
interoperably use state files, so we can update the remote backend
version compatibility check accordingly.
Because our validation rules depend on some dynamic results produced by
actually running the plan, we deal with moves in a "backwards" order where
we try to apply them first -- ignoring anything strange we might find --
and then validate the original statements only after planning.
An unfortunate consequence of that approach is that when the move
statements are invalid it's likely that move execution will not fully
complete, and so the generated plan is likely to be incorrect and might
well include errors resulting from the unresolved moves.
To mitigate that, here we let any move validation errors supersede all
other diagnostics that the plan phase might've generated, in the hope that
it'll help the user focus on fixing the incorrect move statements without
creating confusing by reporting errors that only appeared as a quick of
how Terraform worked around the invalid move statements earlier.
In most cases Terraform will be able to automatically fully resolve all
of the pending move statements before creating a plan, but there are some
edge cases where we can end up wanting to move one object to a location
where another object is already declared.
One relatively-obvious example is if someone uses "terraform state mv" in
order to create a set of resource instance bindings that could never have
arising in normal Terraform use.
A less obvious example arises from the interactions between moves at
different levels of granularity. If we are both moving a module to a new
address and moving a resource into an instance of the new module at the
same time, the old module might well have already had a resource of the
same name and so the resource move will be unresolvable.
In these situations Terraform will move the objects as far as possible,
but because it's never valid for a move "from" address to still be
declared in the configuration Terraform will inevitably always plan to
destroy the objects that didn't find a final home. To give some additional
explanation for that result, here we'll add a warning which describes
what happened.
This is not a particularly actionable warning because we don't really
have enough information to guess what the user intended, but we do at
least prompt that they might be able to use the "terraform state" family
of subcommands to repair the ambiguous situation before planning, if they
want a different result than what Terraform proposed.
The core runtime is now able to specify a reason for some situations when
Terraform plans to delete a resource instance.
This commit makes that information visible in the human-oriented UI. A
previous commit already made the underlying data informing these new hints
visible as part of the machine-oriented (JSON) plan output.
This also removes the bold formatting from the existing "has moved to"
hints, because subjectively it seemed like the result was emphasizing too
many parts of the output and thus somewhat defeating the benefit of the
emphasis in trying to create additional visual hierarchy for sighted users
running Terraform in a terminal. Now only the first line containing the
main action statement will be in bold, and all of the parenthesized
follow-up notes will be unformatted.
There are a few different reasons why a resource instance tracked in the
prior state might be considered an "orphan", but previously we reported
them all identically in the planned changes.
In order to help users understand the reason for a surprising planned
delete, we'll now try to specify an additional reason for the planned
deletion, covering all of the main reasons why that could happen.
This commit only introduces the new detail to the plans.Changes result,
though it also incidentally exposes it as part of the JSON plan result
in order to keep that working without returning errors in these new
cases. We'll expose this information in the human-oriented UI output in
a subsequent commit.
Our previous rule for implicitly moving from IntKey(0) to NoKey would
apply that move even when the current resource configuration uses
for_each, because we were only considering whether "count" were set.
Previously this was relatively harmless because the resource instance in
question would end up planned for deletion anyway: neither an IntKey nor
a NoKey are valid keys for for_each.
Now that we're going to be announcing these moves explicitly in the UI,
it would be confusing to see Terraform report that IntKey moved to NoKey
in a situation where the config changed from count to for_each, so to
address that we'll only generate the implied statement if neither
repetition argument is set.
When planning in refresh-only mode, we must not remove orphaned
resources due to changed count or for_each values from the planned
state. This was previously happening because we failed to pass through
the plan's skip-plan-changes flag to the instance orphan node.
When initializing a backend, if the currently selected workspace does
not exist, the user is prompted to select from the list of workspaces
the backend provides.
Instead, we should automatically select the only workspace available
_if_ that's all that's there.
Although with being a nice bit of polish, this enables future
improvments with Terraform Cloud in potentially removing the implicit
depenency on always using the 'default' workspace when the current
configuration is mapped to a single TFC workspace.
We can also rule out some attribute types as indicating something other
than the legacy SDK.
- Tuple types were not generated at all.
- There were no single objects types, the convention was to use a block
list or set of length 1.
- Maps of objects were not possible to generate, since named blocks were
not implemented.
- Nested collections were not supported, but when they were generated they
would have primitive types.
If structural types are being used, we can be assured that the legacy
SDK SchemaConfigModeAttr is not being used, and the fixup is not needed.
This prevents inadvertent mapping of blocks to structural attributes,
and allows us to skip the fixup overhead when possible.
When we originally stubbed ApplyMoves we didn't know yet how exactly we'd
be using the result, so we made it a double-indexed map allowing looking
up moves in both directions.
However, in practice we only actually need to look up old addresses by new
addresses, and so this commit first removes the double indexing so that
each move is only represented by one element in the map.
We also need to describe situations where a move was blocked, because in
a future commit we'll generate some warnings in those cases. Therefore
ApplyMoves now returns a MoveResults object which contains both a map of
changes and a map of blocks. The map of blocks isn't used yet as of this
commit, but we'll use it in a later commit to produce warnings within
the "terraform" package.
The whole point of UniqueKey is to deal with the fact that we have some
distinct address types which have an identical string representation, but
unfortunately that fact caused us to not notice that we'd incorrectly
made AbsResource.UniqueKey return a no-key instance UniqueKey instead of
its own distinct unique key type.
Remove answers from testInputResponse as they are given, and raise an
error during cleanup if any answers remain unused.
This enables tests to ensure that the expected mock answers are actually
used in a test; previously, an entire branch of code including an input
sequence could be omitted and the test(s) would not fail.
The only test that had unused answers in this map is one leftover from
legacy state migrations, a prompt that was removed in
7c93b2e5e6
Add previous address information to the `planned_change` and
`resource_drift` messages for the streaming JSON UI output of plan and
apply operations.
Here we also add a "move" action value to the `change` object of these
messages, to represent a move-only operation.
As part of this work we also simplify this code to use the plan's
DriftedResources values instead of recomputing the drift from state.
Configuration-driven moves are represented in the plan file by setting
the resource's `PrevRunAddr` to a different value than its `Addr`. For
JSON plan output, we here add a new field to resource changes,
`previous_address`, which is present and non-empty only if the resource
is planned to be moved.
Like the CLI UI, refresh-only plans will include move-only changes in
the resource drift JSON output. In normal plan mode, these are elided to
avoid redundancy with planned changes.
Going back a long time we've had a special magic behavior which tries to
recognize a situation where a module author either added or removed the
"count" argument from a resource that already has instances, and to
silently rename the zeroth or no-key instance so that we don't plan to
destroy and recreate the associated object.
Now we have a more general idea of "move statements", and specifically
the idea of "implied" move statements which replicates the same heuristic
we used to use for this behavior, we can treat this magic renaming rule as
just another "move statement", special only in that Terraform generates it
automatically rather than it being written out explicitly in the
configuration.
In return for wiring that in, we can now remove altogether the
NodeCountBoundary graph node type and its associated graph transformer,
CountBoundaryTransformer. We handle moves as a preprocessing step before
building the plan graph, so we no longer need to include any special nodes
in the graph to deal with that situation.
The test updates here are mainly for the graph builders themselves, to
acknowledge that indeed we're no longer inserting the NodeCountBoundary
vertices. The vertices that NodeCountBoundary previously depended on now
become dependencies of the special "root" vertex, although in many cases
here we don't see that explicitly because of the transitive reduction
algorithm, which notices when there's already an equivalent indirect
dependency chain and removes the redundant edge.
We already have plenty of test coverage for these "count boundary" cases
in the context tests whose names start with TestContext2Plan_count and
TestContext2Apply_resourceCount, all of which continued to pass here
without any modification and so are not visible in the diff. The test
functions particularly relevant to this situation are:
- TestContext2Plan_countIncreaseFromNotSet
- TestContext2Plan_countDecreaseToOne
- TestContext2Plan_countOneIndex
- TestContext2Apply_countDecreaseToOneCorrupted
The last of those in particular deals with the situation where we have
both a no-key instance _and_ a zero-key instance in the prior state, which
is interesting here because to exercises an intentional interaction
between refactoring.ImpliedMoveStatements and refactoring.ApplyMoves,
where we intentionally generate an implied move statement that produces
a collision and then expect ApplyMoves to deal with it in the same way as
it would deal with all other collisions, and thus ensure we handle both
the explicit and implied collisions in the same way.
This does affect some UI-level tests, because a nice side-effect of this
new treatment of this old feature is that we can now report explicitly
in the UI that we're assigning new addresses to these objects, whereas
before we just said nothing and hoped the user would just guess what had
happened and why they therefore weren't seeing a diff.
The backend/local plan tests actually had a pre-existing bug where they
were using a state with a different instance key than the config called
for but getting away with it because we'd previously silently fix it up.
That's still fixed up, but now done with an explicit mention in the UI
and so I made the state consistent with the configuration here so that the
tests would be able to recognize _real_ differences where present, as
opposed to the errant difference caused by that inconsistency.
Per our rule that the content of the state can never make a move statement
invalid, our behavior for two objects trying to occupy the same address
will be to just ignore that and let the object already at the address
take priority.
For the moment this is silent from an end-user perspective and appears
only in our internal logs. However, I'm hoping that our future planned
adjustment to the interface of this function will include some way to
allow reporting these collisions in some end-user-visible way, either as
a separate warning per collision or as a single warning that collects
together all of the collisions into a single message somehow.
This situation can arise both because the previous run state already
contained an object at the target address of a move and because more than
one move ends up trying to target the same location. In the latter case,
which one "wins" is decided by our depth-first traversal order, which is
in turn derived from our chaining and nesting rules and is therefore
arbitrary but deterministic.
This new function complements the existing function FindMoveStatements
by potentially generating additional "implied" move statements that aren't
written explicit in the configuration but that we'll infer by comparing
the configuration and te previous run state.
The goal here is to infer only enough to replicate the effect of the
"count boundary fixup" graph node (terraform.NodeCountBoundary) that we
currently use to deal with this concern of preserving the zero-instance
when switching between "count" and not "count".
This is just dead code for now. A subsequent commit will introduce this
into the "terraform" package while also removing
terraform.NodeCountBoundary, thus achieving the same effect as before but
in a way that'll get reported in the UI as a move, using the same language
that we'd use for an explicit move statement.
This is similar to the existing SelectsModule method, returning true if
the reciever selects either a particular resource as a whole or any of the
instances of that resource.
We don't need this test in the normal case, but we will need it in a
subsequent commit when we'll be possibly generating _implied_ move
statements between instances of resources, but only if there aren't
explicit move statements mentioning those resources already.
The set of drifted resources now includes move-only changes, where the
object value is identical but a move has been executed. In normal
operation, we previousl displayed these moves twice: once as part of
drift output, and once as part of planned changes.
As of this commit we omit move-only changes from drift display, except
for refresh-only plans. This fixes the redundant output.
Previously, drifted resources included only updates and deletes. To
correctly display the full changes which would result as part of a
refresh-only apply, the drifted resources must also include move-only
changes.
Colorizing the result of an interpolated string can result in
incorrect output, if the values used to generate the string happen to
include color codes such as `[red]` or `[bold]`. Instead we should
always colorize the format string before calling functions like
`Sprintf`. This commit fixes all instances in this file.
Rather than delaying resource drift detection until it is ready to be
presented, here we perform that computation after the plan walk has
completed. The resulting drift is represented like planned resource
changes, using a slice of ResourceInstanceChangeSrc values.
Because "moved" blocks produce changes that span across more than one
resource instance address at the same time, we need to take extra care
with them during planning.
The -target option allows for restricting Terraform's attention only to
a subset of resources when planning, as an escape hatch to recover from
bugs and mistakes.
However, we need to avoid any situation where only one "side" of a move
would be considered in a particular plan, because that'd create a new
situation that would be otherwise unreachable and would be difficult to
recover from.
As a compromise then, we'll reject an attempt to create a targeted plan if
the plan involves resolving a pending move and if the source address of
that move is not included in the targets.
Our error message offers the user two possible resolutions: to create an
untargeted plan, thus allowing everything to resolve, or to add additional
-target options to include just the existing resource instances that have
pending moves to resolve.
This compromise recognizes that it is possible -- though hopefully rare --
that a user could potentially both be recovering from a bug or mistake at
the same time as processing a move, if e.g. the bug was fixed by upgrading
a module and the new version includes a new "moved" block. In that edge
case, it might be necessary to just add the one additional address to
the targets rather than removing the targets altogether, if creating a
normal untargeted plan is impossible due to whatever bug they're trying to
recover from.
When originally filling out these test cases we didn't yet have the logic
in place to detect chained moves and so this test couldn't succeed in
spite of being correct.
We now have chain-detection implemented and so consequently we can also
detect cyclic chains. This commit largely just enables the original test
unchanged, although it does include the text of the final error message
for reporting cyclic move chains which wasn't yet finalized when we were
stubbing out this test case originally.
The CoerceValue code was not updated to handle NestedTypes, and while
none of the new codepaths make use of this method, there are still some
internal uses.
The original intent of this test was to verify that we properly release
the state lock if terraform.NewContext fails. This was in response to a
bug in an earlier version of Terraform where that wasn't true.
In the recent refactoring that made terraform.NewContext no longer
responsible for provider constraint/checksum verification, this test began
testing a failed plan operation instead, which left the error return path
from terraform.NewContext untested.
An invalid parallelism value is the one remaining case where
terraform.NewContext can return an error, so as a localized fix for this
test I've switched it to just intentionally set an invalid parallelism
value. This is still not ideal because it's still testing an
implementation detail, but I've at least left a comment inline to try to
be clearer about what the goal is here so that we can respond in a more
appropriate way if future changes cause this test to fail again.
In the long run I'd like to move this last remaining check out to be the
responsibility of the CLI layer, with terraform.NewContext either just
assuming the value correct or panicking when it isn't, but the handling
of this CLI option is currently rather awkwardly spread across the
command and backend packages so we'll save that refactoring for a later
date.
The presence of this field was confusing because in practice the local
backend doesn't use it for anything and the remote backend was using it
only to return an error if it's set to anything other than the default,
under the assumption that it would always match ContextOpts.Parallelism.
The "command" package is the one actually responsible for handling this
option, and it does so by placing it into the partial ContextOpts which it
passes into the backend when preparing for a local operation. To make that
clearer, here we remove Operation.Parallelism and change the few uses of
it to refer to ContextOpts.Parallelism instead, so that everyone is
reading and writing this value from the same place.
In order to handle optional attributes, the Variable type needs to keep
track of the type constraint for decoding and conversion, as well as the
concrete type for creating values and type comparison.
Since the Type field is referenced throughout the codebase, and for
future refactoring if the handling of optional attributes changes
significantly, the constraint is now loaded into an entirely new field
called ConstraintType. This prevents types containing
ObjectWithOptionalAttrs from escaping the decode/conversion codepaths
into the rest of the codebase.
Objects with optional attributes are only used for the decoding of
HCL, and those types should never be exposed elsewhere within
terraform.
Separate the external ImpliedType method from the cty.Type generated
internally for the decoder spec. This unfortunately causes our
ImpliedType method to return a different type than the
hcldec.ImpliedType function, but the former is only used within
terraform for concrete values, while the latter is used to decode
HCL. Renaming the ImpliedType methods could be done to further
differentiate them, but that does cause fairly large diff in the
codebase that does not seem worth the effort at this time.
Thus far our various interactions with the bits of state we keep
associated with a working directory have all been implemented directly
inside the "command" package -- often in the huge command.Meta type -- and
not managed collectively via a single component.
There's too many little codepaths reading and writing from the working
directory and data directory to refactor it all in one step, but this is
an attempt at a first step towards a future where everything that reads
and writes from the current working directory would do so via an object
that encapsulates the implementation details and offers a high-level API
to read and write all of these session-persistent settings.
The design here continues our gradual path towards using a dependency
injection style where "package main" is solely responsible for directly
interacting with the OS command line, the OS environment, the OS working
directory, the stdio streams, and the CLI configuration, and then
communicating the resulting information to the rest of Terraform by wiring
together objects. It seems likely that eventually we'll have enough wiring
code in package main to justify a more explicit organization of that code,
but for this commit the new "workdir.Dir" object is just wired directly in
place of its predecessors, without any significant change of code
organization at that top layer.
This first commit focuses on the main files and directories we use to
find provider plugins, because a subsequent commit will lightly reorganize
the separation of concerns for plugin launching with a similar goal of
collecting all of the relevant logic together into one spot.
Previously our graph walker expected to recieve a data structure
containing schemas for all of the provider and provisioner plugins used in
the configuration and state. That made sense back when
terraform.NewContext was responsible for loading all of the schemas before
taking any other action, but it no longer has that responsiblity.
Instead, we'll now make sure that the "contextPlugins" object reaches all
of the locations where we need schema -- many of which already had access
to that object anyway -- and then load the needed schemas just in time.
The contextPlugins object memoizes schema lookups, so we can safely call
it many times with the same provider address or provisioner type name and
know that it'll still only load each distinct plugin once per Context
object.
As of this commit, the Context.Schemas method is now a public interface
only and not used by logic in the "terraform" package at all. However,
that does leave us in a rather tenuous situation of relying on the fact
that all practical users of terraform.Context end up calling "Schemas" at
some point in order to verify that we have all of the expected versions
of plugins. That's a non-obvious implicit dependency, and so in subsequent
commits we'll gradually move all responsibility for verifying plugin
versions into the caller of terraform.NewContext, which'll heal a
long-standing architectural wart whereby the caller is responsible for
installing and locating the plugin executables but not for verifying that
what's installed is conforming to the current configuration and dependency
lock file.
Previously the graph builders all expected to be given a full manifest
of all of the plugin component schemas that they could need during their
analysis work. That made sense when terraform.NewContext would always
proactively load all of the schemas before doing any other work, but we
now have a load-as-needed strategy for schemas.
We'll now have the graph builders use the contextPlugins object they each
already hold to retrieve individual schemas when needed. This avoids the
need to prepare a redundant data structure to pass alongside the
contextPlugins object, and leans on the memoization behavior inside
contextPlugins to preserve the old behavior of loading each provider's
schema only once.
By tolerating ProviderSchema and ProvisionerSchema potentially returning
errors, we can slightly simplify EvalContextBuiltin by having it retrieve
individual schemas when needed directly from the "Plugins" object.
EvalContextBuiltin already needs to be holding a contextPlugins instance
for other reasons anyway, so this allows us to get the same result with
fewer moving parts.