The previous conservative guarantee that we would not make backwards
incompatible changes to the state file format until at least Terraform
1.1 can now be extended. Terraform 0.14 through 1.1 will be able to
interoperably use state files, so we can update the remote backend
version compatibility check accordingly.
This is a port of https://github.com/hashicorp/terraform/pull/29645
This changes the 'name' strategy to always align the local configured
workspace name and the remote Terraform Cloud workspace, rather than the
implicit use of the 'default' unnamed workspace being used instead.
What this essentially means is that the Cloud integration does not fully
support workspaces when configured for a single TFC workspace (as was
the case with the 'remote' backend), but *does* use the
backend.Workspaces() interface to allow for normal local behaviors like
terraform.workspace to resolve to the correct name. It does this by
always setting the local workspace name when the 'name' strategy is
used, as a part of initialization.
Part of the diff here is exporting all the previously unexported types
for mapping workspaces. The command package (and init in particular)
needs to be able to handle setting the local workspace in this
particular scenario.
Implementing this test was quite a rabbithole, as in order to satisfy
backendTestBackendStates() the workspaces returned from
backend.Workspaces() must match exactly, and the shortcut taken to test
pagination in 3cc58813f0 created an
impossible circumstance that got plastered over with the fact that
prefix filtering is done clientside, not by the API as it should be.
Tagging does not rely on clientside filtering, and expects that the
request made to the TFC API returns exactly those workspaces with the
given tags.
These changes include a better way to test pagination, wherein we
actually create over a page worth of valid workspaces in the mock client
and implement a simplified pagination behavior to match how the TFC API
actually works.
A mostly cosemetic change; The fields 'workspace' and 'prefix' don't
really describe well what they are from a caller, so change these to use
a workspaceMapping struct to convey they are for implementing workspace
mapping strategies from CLI -> TFC
These changes include additions to fulfill the interface for the client
mock, plus moving all that logic (which needn't be duplicated across
both the remote and cloud packages) over to the cloud package under a
dedicated mock client file.
These changes allow cloud blocks to be overridden by backend blocks and
vice versa; the logic follows the current backend behavior of a block
overriding a preceding block in full, with no merges.
This restriction is temporary. Overrides should be allowed, but have the
added complexity of needing to also override a 'backend' block, so this
work is being deferred for now.
With the alternative block introduced in 7bf9b2c7b, this removes the
ability to explicitly declare the 'cloud' backend. The literal backend
interface is an implementation detail and no longer a user-level
concept when using Terraform Cloud.
This is a replacement declaration for using Terraform Cloud as a remote
backend, leaving the literal backend as an implementation detail and not
a user-level concept.
The cloud package intends to implement a new integration for
Terraform Cloud/Enterprise. The purpose of this integration is to better
support TFC users; it will shed some overly generic UX and architecture,
behavior changes that are otherwise backwards incompatible in the remote
backend, and technical debt - all of which are vestiges from before
Terraform Cloud existed.
This initial commit is largely a porting of the existing 'remote'
backend, which will serve as an underlying implementation detail and not
be a typical user-level backend. This is because to re-implement the
literal backend interface is orthogonal to the purpose of this
integration, and can always be migrated away from later.
As this backend is considered an implementation detail, it will not be
registered as a declarable backend. Within these changes it is, for easy
of initial development and a clean diff.
When running `terraform init` against a backend with multiple
workspaces, none of which are the currently indicated local workspace,
Terraform prompts the user to choose a workspace from the list. In
automation, using the `-input=false` argument should disable asking for
input, but previously would hang instead.
When an explicit backend is configured with a configuration which has
not yet been initialized, running `terraform init` performs a state
migration to fetch the remotely stored state in order to operate on it.
Like the previous bug introduced by the recent provider diagnostics
change, this code path was not correctly configured to enable init mode
for the backend, which resulted in a fatal error during init when the
cache dir is deleted.
Setting the `Init` backend option allows this code path to continue
without error when first initializing the backend for state migration.
The new e2e test fails without this change.
When migrating state to an existing Terraform Cloud workspace using the
remote backend, we check the remote version is compatible with the local
one by default.
This commit fixes two bugs in this code:
- If using the "name" strategy for the remote backend, the list of
destination workspaces is empty. This resulted in no version checking
of the remote workspace, and we fell back to the string equality
check.
- The user-specified CLI flag `-ignore-remote-version` was not being
applied for the state migration version checking.
The init command needs to initialize a backend, in order to access
state, in turn to derive provider requirements from state. The backend
initialization step requires building provider factories, which
previously would fail if a lockfile was present without a corresponding
local provider cache.
This commit ensures that in this situation only, errors with the
provider factories are temporarily ignored. This allows us to continue
to initialize the backend, fetch providers, and then report any errors
as necessary.
We test that a deleted provider cache results in an error when running
terraform plan, but previously did not test that running init (as
instructed) would resolve the issue. This (failing) e2e test adds that
step.
We introduced this experiment to gather feedback, and the feedback we saw
led to us deciding to do another round of design work before we move
forward with something to meet this use-case.
In addition to being experimental, this has only been included in alpha
releases so far, and so on both counts it is not protected by the
Terraform v1.0 Compatibility Promises.
The -lock and -lock-timeout flags were removed prior to the release of
1.0 as they were thought to have no effect. This is not true in the case
of state migrations when changing backends. This commit restores these
flags, and adds test coverage for locking during backend state
migration.
Also update the help output describing other boolean flags, showing the
argument as the user would type it rather than the default behavior.
There is a race between the MockSource and ShutdownCh which sometimes
causes this test to fail. Add a HangingSource implementation of Source
which hangs until the context is cancelled, so that there is always time
for a user-initiated shutdown to trigger the cancellation code path
under test.
We don't use this library anywhere else in Terraform, and this backend was
using it only for trivial helpers that are easy to express inline anyway.
The new direct code is also type-checkable, whereas these helper functions
seem to be written using reflection.
This gives us one fewer dependency to worry about and makes the test code
for this backend follow a similar assertions style as the rest of this
codebase.
Ensure that we still check for a stale plan even when it was created
with no previous state.
Create separate errors for incorrect lineage vs incorrect serial.
To prevent confusion when applying a first plan multiple times, only
report it as a stale plan rather than different lineage.
Previously we would reject attempts to delete a workspace if its state
contained any resources at all, even if none of the resources had any
resource instance objects associated with it.
Nowadays there isn't any situation where the normal Terraform workflow
will leave behind resource husks, and so this isn't as problematic as it
might've been in the v0.12 era, but nonetheless what we actually care
about for this check is whether there might be any remote objects that
this state is tracking, and for that it's more precise to look for
non-nil resource instance objects, rather than whole resources.
This also includes some adjustments to our error messaging to give more
information about the problem and to use terminology more consistent with
how we currently talk about this situation in our documentation and
elsewhere in the UI.
We were also using the old State.HasResources method as part of some of
our tests. I considered preserving it to avoid changing the behavior of
those tests, but the new check seemed close enough to the intent of those
tests that it wasn't worth maintaining this method that wouldn't be used
in any main code anymore. I've therefore updated those tests to use
the new HasResourceInstanceObjects method instead.
When a test uses multiple instances of the same provider, we may need to
have separate objects to prevent overwriting of the MockProvider state.
Create a completely new MockProvider in each factory function call
rather than re-using the original provider value.
Running the tool this way ensures that we'll always run the version
selected by our go.mod file, rather than whatever happened to be available
in $GOPATH/bin on the system where we're running this.
This change caused some contexts to now be using a newer version of
staticcheck with additional checks, and so this commit also includes some
changes to quiet the new warnings without any change in overall behavior.
A snapshotDir tracks its current position as part of its state, so we need
to use it via pointer rather than value so that Readdirnames can actually
update that position, or else we'll just get stuck at position zero.
In practice this wasn't hurting anything because we only call Readdir once
on our snapshots, to read the whole directory at once. Still nice to fix
to avoid a gotcha for future maintenence, though.
Make the state match the fixture config. The old test was not
technically invalid, but because it caused multiple instances of the
provider to be created, they were backed by the same MockProvider value
resulting in the `*Called` fields interfering.
The destroy plan should not require a configured provider (the complete
configuration is not evaluated, so they cannot be configured).
Deposed instances were being refreshed during the destroy plan, because
this instance type is only ever destroyed and shares the same
implementation between plan and walkPlanDestroy. Skip refreshing during
walkPlanDestroy.
Have the MockProvider ensure that Configure is always called before any
methods that may require a configured provider.
Ensure the MockProvider *Called fields are zeroed out when re-using the
provider instance.
We have various mechanisms that aim to ensure that the installed provider
plugins are consistent with the lock file and that the lock file is
consistent with the provider requirements, and we do have existing unit
tests for them, but all of those cases mock our fake out at least part of
the process and in the past that's caused us to miss usability
regressions, where we still catch the error but do so at the wrong layer
and thus generate error message lacking useful additional context.
Here we'll add some new end-to-end tests to supplement the existing unit
tests, making sure things work as expected when we assemble the system
together as we would in a release. These tests cover a number of different
ways in which the plugin selections can grow inconsistent.
These new tests all run only when we're in a context where we're allowed
to access the network, because they exercise the real plugin installer
codepath. We could technically build this to use a local filesystem mirror
or other such override to avoid that, but the point here is to make sure
we see the expected behavior in the main case, and so it's worth the
small additional cost of downloading the null provider from the real
registry.
In the original incarnation of Meta.providerFactories we were returning
into a Meta.contextOpts whose signature didn't allow it to return an
error directly, and so we had compromised by making the provider factory
functions themselves return errors once called.
Subsequent work made Meta.contextOpts need to return an error anyway, but
at the time we neglected to update our handling of the providerFactories
result, having it still defer the error handling until we finally
instantiate a provider.
Although that did ultimately get the expected result anyway, the error
ended up being reported from deep in the guts of a Terraform Core graph
walk, in whichever concurrently-visited graph node happened to try to
instantiate the plugin first. This meant that the exact phrasing of the
error message would vary between runs and the reporting codepath didn't
have enough context to given an actionable suggestion on how to proceed.
In this commit we make Meta.contextOpts pass through directly any error
that Meta.providerFactories produces, and then make Meta.providerFactories
produce a special error type so that Meta.Backend can ultimately return
a user-friendly diagnostic message containing a specific suggestion to
run "terraform init", along with a short explanation of what a provider
plugin is.
The reliance here on an implied contract between two functions that are
not directly connected in the callstack is non-ideal, and so hopefully
we'll revisit this further in future work on the overall architecture of
the CLI layer. To try to make this robust in the meantime though, I wrote
it to use the errors.As function to potentially unwrap a wrapped version
of our special error type, in case one of the intervening layers is
changed at some point to wrap the downstream error before returning it.
The codepath for AllAttributesNull was not correct for any nested object
types with collections, and should create single null values for the
correct NestingMode rather than a single object with null attributes.
Since there is no reason to descend into nested object types to create
nullv alues, we can drop the AllAttributesNull function altogether and
create null values as needed during ProposedNew.
The corresponding AllBlockAttributesNull was only called internally in 1
location, and simply delegated to schema.EmptyValue. We can reduce the
package surface area by dropping that function too and calling
EmptyValue directly.
In historical versions of Terraform the responsibility to check this was
inside the terraform.NewContext function, along with various other
assorted concerns that made that function particularly complicated.
More recently, we reduced the responsibility of the "terraform" package
only to instantiating particular named plugins, assuming that its caller
is responsible for selecting appropriate versions of any providers that
_are_ external. However, until this commit we were just assuming that
"terraform init" had correctly selected appropriate plugins and recorded
them in the lock file, and so nothing was dealing with the problem of
ensuring that there haven't been any changes to the lock file or config
since the most recent "terraform init" which would cause us to need to
re-evaluate those decisions.
Part of the game here is to slightly extend the role of the dependency
locks object to also carry information about a subset of provider
addresses whose lock entries we're intentionally disregarding as part of
the various little edge-case features we have for overridding providers:
dev_overrides, "unmanaged providers", and the testing overrides in our
own unit tests. This is an in-memory-only annotation, never included in
the serialized plan files on disk.
I had originally intended to create a new package to encapsulate all of
this plugin-selection logic, including both the version constraint
checking here and also the handling of the provider factory functions, but
as an interim step I've just made version constraint consistency checks
the responsibility of the backend/local package, which means that we'll
always catch problems as part of preparing for local operations, while
not imposing these additional checks on commands that _don't_ run local
operations, such as "terraform apply" when in remote operations mode.
We recently removed the legacy way we used to track the SHA256 hashes of
individual provider executables as part of a plans.Plan, because these
days we want to track the checksums of entire provider packages rather
than just the executable.
In order to achieve that new goal, we can save a copy of the dependency
lock information inside the plan file. This follows our existing precedent
of using exactly the same serialization formats we'd normally use for
this information, and thus we can reuse the existing models and
serializers and be confident we won't lose any detail in the round-trip.
As of this commit there's not yet anything actually making use of this
mechanism. In a subsequent commit we'll teach the main callers that write
and read plan files to include and expect (respectively) dependency
information, verifying that the available providers still match by the
time we're applying the plan.
Previously the planfile.Create function had accumulated probably already
too many positional arguments, and I'm intending to add another one in
a subsequent commit and so this is preparation to make the callsites more
readable (subjectively) and make it clearer how we can extend this
function's arguments to include further components in a plan file.
There's no difference in observable functionality here. This is just
passing the same set of arguments in a slightly different way.
Historically the responsibility for making sure that all of the available
providers are of suitable versions and match the appropriate checksums has
been split rather inexplicably over multiple different layers, with some
of the checks happening as late as creating a terraform.Context.
We're gradually iterating towards making that all be handled in one place,
but in this step we're just cleaning up some old remnants from the
main "terraform" package, which is now no longer responsible for any
version or checksum verification and instead just assumes it's been
provided with suitable factory functions by its caller.
We do still have a pre-check here to make sure that we at least have a
factory function for each plugin the configuration seems to depend on,
because if we don't do that up front then it ends up getting caught
instead deep inside the Terraform runtime, often inside a concurrent
graph walk and thus it's not deterministic which codepath will happen to
catch it on a particular run.
As of this commit, this actually does leave some holes in our checks: the
command package is using the dependency lock file to make sure we have
exactly the provider packages we expect (exact versions and checksums),
which is the most crucial part, but we don't yet have any spot where
we make sure that the lock file is consistent with the current
configuration, and we are no longer preserving the provider checksums as
part of a saved plan.
Both of those will come in subsequent commits. While it's unusual to have
a series of commits that briefly subtracts functionality and then adds
back in equivalent functionality later, the lock file checking is the only
part that's crucial for security reasons, with everything else mainly just
being to give better feedback when folks seem to be using Terraform
incorrectly. The other bits are therefore mostly cosmetic and okay to be
absent briefly as we work towards a better design that is clearer about
where that responsibility belongs.
Only depends_on ancestors for transitive dependencies when we're not
pointed directly at a resource. We can't be much more precise here,
since in order to maintain our guarantee that data sources will wait for
explicit dependencies, if those dependencies happen to be a module,
output, or variable, we have to find some upstream managed resource in
order to check for a planned change.
We must ensure that the terraform required_version is checked as early
as possible, so that new configuration constructs don't cause init to
fail without indicating the version is incompatible.
The loadConfig call before the earlyconfig parsing seems to be unneeded,
and we can delay that to de-tangle it from installing the modules which
may have their own constraints.
TODO: it seems that loadConfig should be able to handle returning the
version constraints in the same manner as loadSingleModule.
Our current implementation of destroy planning includes secretly running a
normal plan first, in order to get its effect of refreshing the state.
Previously our warning about colliding moves would betray that
implementation detail because we'd return it from both of our planning
operations here and thus show the message twice. That would also have
happened in theory for any other warnings emitted by both plan operations,
but it's the move collision warning that made it immediately visible.
We'll now only return warnings from the initial plan if we're also
returning errors from that plan, and thus the warnings of both plans can
never mix together into the same diags and thus we'll avoid duplicating
any warnings.
This does mean that we'd lose any warnings which might hypothetically
emerge only from the hidden normal plan and not from the subsequent
destroy plan, but we'll accept that as an okay tradeoff here because those
warnings are likely to not be super relevant to the destroy case anyway,
or else we'd emit them from the destroy-plan walk too.
The extra feedback information for why resource instance deletion is
planned is now included in the streaming JSON UI output.
We also add an explicit case for no-op actions to switch statements in
this package to ensure exhaustiveness, for future linting.
Minor version increase to deprecate min_items and max_items in nested
types.
Nested types have MinItems and MaxItems fields that were inherited from
the block implementation, but were never validated by Terraform, and are
not supported by the HCL decoder validations. Mark these fields as
deprecated, indicating that the SDK should handle the required
validation.
The previous conservative guarantee that we would not make backwards
incompatible changes to the state file format until at least Terraform
1.1 can now be extended. Terraform 0.14 through 1.1 will be able to
interoperably use state files, so we can update the remote backend
version compatibility check accordingly.
Because our validation rules depend on some dynamic results produced by
actually running the plan, we deal with moves in a "backwards" order where
we try to apply them first -- ignoring anything strange we might find --
and then validate the original statements only after planning.
An unfortunate consequence of that approach is that when the move
statements are invalid it's likely that move execution will not fully
complete, and so the generated plan is likely to be incorrect and might
well include errors resulting from the unresolved moves.
To mitigate that, here we let any move validation errors supersede all
other diagnostics that the plan phase might've generated, in the hope that
it'll help the user focus on fixing the incorrect move statements without
creating confusing by reporting errors that only appeared as a quick of
how Terraform worked around the invalid move statements earlier.
In most cases Terraform will be able to automatically fully resolve all
of the pending move statements before creating a plan, but there are some
edge cases where we can end up wanting to move one object to a location
where another object is already declared.
One relatively-obvious example is if someone uses "terraform state mv" in
order to create a set of resource instance bindings that could never have
arising in normal Terraform use.
A less obvious example arises from the interactions between moves at
different levels of granularity. If we are both moving a module to a new
address and moving a resource into an instance of the new module at the
same time, the old module might well have already had a resource of the
same name and so the resource move will be unresolvable.
In these situations Terraform will move the objects as far as possible,
but because it's never valid for a move "from" address to still be
declared in the configuration Terraform will inevitably always plan to
destroy the objects that didn't find a final home. To give some additional
explanation for that result, here we'll add a warning which describes
what happened.
This is not a particularly actionable warning because we don't really
have enough information to guess what the user intended, but we do at
least prompt that they might be able to use the "terraform state" family
of subcommands to repair the ambiguous situation before planning, if they
want a different result than what Terraform proposed.
The core runtime is now able to specify a reason for some situations when
Terraform plans to delete a resource instance.
This commit makes that information visible in the human-oriented UI. A
previous commit already made the underlying data informing these new hints
visible as part of the machine-oriented (JSON) plan output.
This also removes the bold formatting from the existing "has moved to"
hints, because subjectively it seemed like the result was emphasizing too
many parts of the output and thus somewhat defeating the benefit of the
emphasis in trying to create additional visual hierarchy for sighted users
running Terraform in a terminal. Now only the first line containing the
main action statement will be in bold, and all of the parenthesized
follow-up notes will be unformatted.
There are a few different reasons why a resource instance tracked in the
prior state might be considered an "orphan", but previously we reported
them all identically in the planned changes.
In order to help users understand the reason for a surprising planned
delete, we'll now try to specify an additional reason for the planned
deletion, covering all of the main reasons why that could happen.
This commit only introduces the new detail to the plans.Changes result,
though it also incidentally exposes it as part of the JSON plan result
in order to keep that working without returning errors in these new
cases. We'll expose this information in the human-oriented UI output in
a subsequent commit.
Our previous rule for implicitly moving from IntKey(0) to NoKey would
apply that move even when the current resource configuration uses
for_each, because we were only considering whether "count" were set.
Previously this was relatively harmless because the resource instance in
question would end up planned for deletion anyway: neither an IntKey nor
a NoKey are valid keys for for_each.
Now that we're going to be announcing these moves explicitly in the UI,
it would be confusing to see Terraform report that IntKey moved to NoKey
in a situation where the config changed from count to for_each, so to
address that we'll only generate the implied statement if neither
repetition argument is set.
When planning in refresh-only mode, we must not remove orphaned
resources due to changed count or for_each values from the planned
state. This was previously happening because we failed to pass through
the plan's skip-plan-changes flag to the instance orphan node.
When initializing a backend, if the currently selected workspace does
not exist, the user is prompted to select from the list of workspaces
the backend provides.
Instead, we should automatically select the only workspace available
_if_ that's all that's there.
Although with being a nice bit of polish, this enables future
improvments with Terraform Cloud in potentially removing the implicit
depenency on always using the 'default' workspace when the current
configuration is mapped to a single TFC workspace.
We can also rule out some attribute types as indicating something other
than the legacy SDK.
- Tuple types were not generated at all.
- There were no single objects types, the convention was to use a block
list or set of length 1.
- Maps of objects were not possible to generate, since named blocks were
not implemented.
- Nested collections were not supported, but when they were generated they
would have primitive types.
If structural types are being used, we can be assured that the legacy
SDK SchemaConfigModeAttr is not being used, and the fixup is not needed.
This prevents inadvertent mapping of blocks to structural attributes,
and allows us to skip the fixup overhead when possible.
When we originally stubbed ApplyMoves we didn't know yet how exactly we'd
be using the result, so we made it a double-indexed map allowing looking
up moves in both directions.
However, in practice we only actually need to look up old addresses by new
addresses, and so this commit first removes the double indexing so that
each move is only represented by one element in the map.
We also need to describe situations where a move was blocked, because in
a future commit we'll generate some warnings in those cases. Therefore
ApplyMoves now returns a MoveResults object which contains both a map of
changes and a map of blocks. The map of blocks isn't used yet as of this
commit, but we'll use it in a later commit to produce warnings within
the "terraform" package.
The whole point of UniqueKey is to deal with the fact that we have some
distinct address types which have an identical string representation, but
unfortunately that fact caused us to not notice that we'd incorrectly
made AbsResource.UniqueKey return a no-key instance UniqueKey instead of
its own distinct unique key type.
Remove answers from testInputResponse as they are given, and raise an
error during cleanup if any answers remain unused.
This enables tests to ensure that the expected mock answers are actually
used in a test; previously, an entire branch of code including an input
sequence could be omitted and the test(s) would not fail.
The only test that had unused answers in this map is one leftover from
legacy state migrations, a prompt that was removed in
7c93b2e5e6
Add previous address information to the `planned_change` and
`resource_drift` messages for the streaming JSON UI output of plan and
apply operations.
Here we also add a "move" action value to the `change` object of these
messages, to represent a move-only operation.
As part of this work we also simplify this code to use the plan's
DriftedResources values instead of recomputing the drift from state.
Configuration-driven moves are represented in the plan file by setting
the resource's `PrevRunAddr` to a different value than its `Addr`. For
JSON plan output, we here add a new field to resource changes,
`previous_address`, which is present and non-empty only if the resource
is planned to be moved.
Like the CLI UI, refresh-only plans will include move-only changes in
the resource drift JSON output. In normal plan mode, these are elided to
avoid redundancy with planned changes.
Going back a long time we've had a special magic behavior which tries to
recognize a situation where a module author either added or removed the
"count" argument from a resource that already has instances, and to
silently rename the zeroth or no-key instance so that we don't plan to
destroy and recreate the associated object.
Now we have a more general idea of "move statements", and specifically
the idea of "implied" move statements which replicates the same heuristic
we used to use for this behavior, we can treat this magic renaming rule as
just another "move statement", special only in that Terraform generates it
automatically rather than it being written out explicitly in the
configuration.
In return for wiring that in, we can now remove altogether the
NodeCountBoundary graph node type and its associated graph transformer,
CountBoundaryTransformer. We handle moves as a preprocessing step before
building the plan graph, so we no longer need to include any special nodes
in the graph to deal with that situation.
The test updates here are mainly for the graph builders themselves, to
acknowledge that indeed we're no longer inserting the NodeCountBoundary
vertices. The vertices that NodeCountBoundary previously depended on now
become dependencies of the special "root" vertex, although in many cases
here we don't see that explicitly because of the transitive reduction
algorithm, which notices when there's already an equivalent indirect
dependency chain and removes the redundant edge.
We already have plenty of test coverage for these "count boundary" cases
in the context tests whose names start with TestContext2Plan_count and
TestContext2Apply_resourceCount, all of which continued to pass here
without any modification and so are not visible in the diff. The test
functions particularly relevant to this situation are:
- TestContext2Plan_countIncreaseFromNotSet
- TestContext2Plan_countDecreaseToOne
- TestContext2Plan_countOneIndex
- TestContext2Apply_countDecreaseToOneCorrupted
The last of those in particular deals with the situation where we have
both a no-key instance _and_ a zero-key instance in the prior state, which
is interesting here because to exercises an intentional interaction
between refactoring.ImpliedMoveStatements and refactoring.ApplyMoves,
where we intentionally generate an implied move statement that produces
a collision and then expect ApplyMoves to deal with it in the same way as
it would deal with all other collisions, and thus ensure we handle both
the explicit and implied collisions in the same way.
This does affect some UI-level tests, because a nice side-effect of this
new treatment of this old feature is that we can now report explicitly
in the UI that we're assigning new addresses to these objects, whereas
before we just said nothing and hoped the user would just guess what had
happened and why they therefore weren't seeing a diff.
The backend/local plan tests actually had a pre-existing bug where they
were using a state with a different instance key than the config called
for but getting away with it because we'd previously silently fix it up.
That's still fixed up, but now done with an explicit mention in the UI
and so I made the state consistent with the configuration here so that the
tests would be able to recognize _real_ differences where present, as
opposed to the errant difference caused by that inconsistency.
Per our rule that the content of the state can never make a move statement
invalid, our behavior for two objects trying to occupy the same address
will be to just ignore that and let the object already at the address
take priority.
For the moment this is silent from an end-user perspective and appears
only in our internal logs. However, I'm hoping that our future planned
adjustment to the interface of this function will include some way to
allow reporting these collisions in some end-user-visible way, either as
a separate warning per collision or as a single warning that collects
together all of the collisions into a single message somehow.
This situation can arise both because the previous run state already
contained an object at the target address of a move and because more than
one move ends up trying to target the same location. In the latter case,
which one "wins" is decided by our depth-first traversal order, which is
in turn derived from our chaining and nesting rules and is therefore
arbitrary but deterministic.
This new function complements the existing function FindMoveStatements
by potentially generating additional "implied" move statements that aren't
written explicit in the configuration but that we'll infer by comparing
the configuration and te previous run state.
The goal here is to infer only enough to replicate the effect of the
"count boundary fixup" graph node (terraform.NodeCountBoundary) that we
currently use to deal with this concern of preserving the zero-instance
when switching between "count" and not "count".
This is just dead code for now. A subsequent commit will introduce this
into the "terraform" package while also removing
terraform.NodeCountBoundary, thus achieving the same effect as before but
in a way that'll get reported in the UI as a move, using the same language
that we'd use for an explicit move statement.
This is similar to the existing SelectsModule method, returning true if
the reciever selects either a particular resource as a whole or any of the
instances of that resource.
We don't need this test in the normal case, but we will need it in a
subsequent commit when we'll be possibly generating _implied_ move
statements between instances of resources, but only if there aren't
explicit move statements mentioning those resources already.
The set of drifted resources now includes move-only changes, where the
object value is identical but a move has been executed. In normal
operation, we previousl displayed these moves twice: once as part of
drift output, and once as part of planned changes.
As of this commit we omit move-only changes from drift display, except
for refresh-only plans. This fixes the redundant output.
Previously, drifted resources included only updates and deletes. To
correctly display the full changes which would result as part of a
refresh-only apply, the drifted resources must also include move-only
changes.
Colorizing the result of an interpolated string can result in
incorrect output, if the values used to generate the string happen to
include color codes such as `[red]` or `[bold]`. Instead we should
always colorize the format string before calling functions like
`Sprintf`. This commit fixes all instances in this file.
Rather than delaying resource drift detection until it is ready to be
presented, here we perform that computation after the plan walk has
completed. The resulting drift is represented like planned resource
changes, using a slice of ResourceInstanceChangeSrc values.
Because "moved" blocks produce changes that span across more than one
resource instance address at the same time, we need to take extra care
with them during planning.
The -target option allows for restricting Terraform's attention only to
a subset of resources when planning, as an escape hatch to recover from
bugs and mistakes.
However, we need to avoid any situation where only one "side" of a move
would be considered in a particular plan, because that'd create a new
situation that would be otherwise unreachable and would be difficult to
recover from.
As a compromise then, we'll reject an attempt to create a targeted plan if
the plan involves resolving a pending move and if the source address of
that move is not included in the targets.
Our error message offers the user two possible resolutions: to create an
untargeted plan, thus allowing everything to resolve, or to add additional
-target options to include just the existing resource instances that have
pending moves to resolve.
This compromise recognizes that it is possible -- though hopefully rare --
that a user could potentially both be recovering from a bug or mistake at
the same time as processing a move, if e.g. the bug was fixed by upgrading
a module and the new version includes a new "moved" block. In that edge
case, it might be necessary to just add the one additional address to
the targets rather than removing the targets altogether, if creating a
normal untargeted plan is impossible due to whatever bug they're trying to
recover from.
When originally filling out these test cases we didn't yet have the logic
in place to detect chained moves and so this test couldn't succeed in
spite of being correct.
We now have chain-detection implemented and so consequently we can also
detect cyclic chains. This commit largely just enables the original test
unchanged, although it does include the text of the final error message
for reporting cyclic move chains which wasn't yet finalized when we were
stubbing out this test case originally.
The CoerceValue code was not updated to handle NestedTypes, and while
none of the new codepaths make use of this method, there are still some
internal uses.
The original intent of this test was to verify that we properly release
the state lock if terraform.NewContext fails. This was in response to a
bug in an earlier version of Terraform where that wasn't true.
In the recent refactoring that made terraform.NewContext no longer
responsible for provider constraint/checksum verification, this test began
testing a failed plan operation instead, which left the error return path
from terraform.NewContext untested.
An invalid parallelism value is the one remaining case where
terraform.NewContext can return an error, so as a localized fix for this
test I've switched it to just intentionally set an invalid parallelism
value. This is still not ideal because it's still testing an
implementation detail, but I've at least left a comment inline to try to
be clearer about what the goal is here so that we can respond in a more
appropriate way if future changes cause this test to fail again.
In the long run I'd like to move this last remaining check out to be the
responsibility of the CLI layer, with terraform.NewContext either just
assuming the value correct or panicking when it isn't, but the handling
of this CLI option is currently rather awkwardly spread across the
command and backend packages so we'll save that refactoring for a later
date.
The presence of this field was confusing because in practice the local
backend doesn't use it for anything and the remote backend was using it
only to return an error if it's set to anything other than the default,
under the assumption that it would always match ContextOpts.Parallelism.
The "command" package is the one actually responsible for handling this
option, and it does so by placing it into the partial ContextOpts which it
passes into the backend when preparing for a local operation. To make that
clearer, here we remove Operation.Parallelism and change the few uses of
it to refer to ContextOpts.Parallelism instead, so that everyone is
reading and writing this value from the same place.
In order to handle optional attributes, the Variable type needs to keep
track of the type constraint for decoding and conversion, as well as the
concrete type for creating values and type comparison.
Since the Type field is referenced throughout the codebase, and for
future refactoring if the handling of optional attributes changes
significantly, the constraint is now loaded into an entirely new field
called ConstraintType. This prevents types containing
ObjectWithOptionalAttrs from escaping the decode/conversion codepaths
into the rest of the codebase.
Objects with optional attributes are only used for the decoding of
HCL, and those types should never be exposed elsewhere within
terraform.
Separate the external ImpliedType method from the cty.Type generated
internally for the decoder spec. This unfortunately causes our
ImpliedType method to return a different type than the
hcldec.ImpliedType function, but the former is only used within
terraform for concrete values, while the latter is used to decode
HCL. Renaming the ImpliedType methods could be done to further
differentiate them, but that does cause fairly large diff in the
codebase that does not seem worth the effort at this time.
Thus far our various interactions with the bits of state we keep
associated with a working directory have all been implemented directly
inside the "command" package -- often in the huge command.Meta type -- and
not managed collectively via a single component.
There's too many little codepaths reading and writing from the working
directory and data directory to refactor it all in one step, but this is
an attempt at a first step towards a future where everything that reads
and writes from the current working directory would do so via an object
that encapsulates the implementation details and offers a high-level API
to read and write all of these session-persistent settings.
The design here continues our gradual path towards using a dependency
injection style where "package main" is solely responsible for directly
interacting with the OS command line, the OS environment, the OS working
directory, the stdio streams, and the CLI configuration, and then
communicating the resulting information to the rest of Terraform by wiring
together objects. It seems likely that eventually we'll have enough wiring
code in package main to justify a more explicit organization of that code,
but for this commit the new "workdir.Dir" object is just wired directly in
place of its predecessors, without any significant change of code
organization at that top layer.
This first commit focuses on the main files and directories we use to
find provider plugins, because a subsequent commit will lightly reorganize
the separation of concerns for plugin launching with a similar goal of
collecting all of the relevant logic together into one spot.
Previously our graph walker expected to recieve a data structure
containing schemas for all of the provider and provisioner plugins used in
the configuration and state. That made sense back when
terraform.NewContext was responsible for loading all of the schemas before
taking any other action, but it no longer has that responsiblity.
Instead, we'll now make sure that the "contextPlugins" object reaches all
of the locations where we need schema -- many of which already had access
to that object anyway -- and then load the needed schemas just in time.
The contextPlugins object memoizes schema lookups, so we can safely call
it many times with the same provider address or provisioner type name and
know that it'll still only load each distinct plugin once per Context
object.
As of this commit, the Context.Schemas method is now a public interface
only and not used by logic in the "terraform" package at all. However,
that does leave us in a rather tenuous situation of relying on the fact
that all practical users of terraform.Context end up calling "Schemas" at
some point in order to verify that we have all of the expected versions
of plugins. That's a non-obvious implicit dependency, and so in subsequent
commits we'll gradually move all responsibility for verifying plugin
versions into the caller of terraform.NewContext, which'll heal a
long-standing architectural wart whereby the caller is responsible for
installing and locating the plugin executables but not for verifying that
what's installed is conforming to the current configuration and dependency
lock file.
Previously the graph builders all expected to be given a full manifest
of all of the plugin component schemas that they could need during their
analysis work. That made sense when terraform.NewContext would always
proactively load all of the schemas before doing any other work, but we
now have a load-as-needed strategy for schemas.
We'll now have the graph builders use the contextPlugins object they each
already hold to retrieve individual schemas when needed. This avoids the
need to prepare a redundant data structure to pass alongside the
contextPlugins object, and leans on the memoization behavior inside
contextPlugins to preserve the old behavior of loading each provider's
schema only once.
By tolerating ProviderSchema and ProvisionerSchema potentially returning
errors, we can slightly simplify EvalContextBuiltin by having it retrieve
individual schemas when needed directly from the "Plugins" object.
EvalContextBuiltin already needs to be holding a contextPlugins instance
for other reasons anyway, so this allows us to get the same result with
fewer moving parts.
The responsibility for actually instantiating a single plugin and reading
out its schema now belongs to the contextPlugins type, which memoizes the
results by each plugin's unique identifier so that we can avoid retrieving
the same schemas multiple times when working with the same context.
This doesn't change the API of Context.Schemas but it does restore the
spirit of an earlier version of terraform.Context which did all of the
schema loading proactively inside terraform.NewContext. In an earlier
commit we reduced the scope of terraform.NewContext, making schema loading
a separate step, but in the process of doing that removed the effective
memoization of the schema results that terraform.NewContext was providing.
The memoization here will play out in a different way than before, because
we'll be treating each plugin call as separate rather than proactively
loading them all up front, but this is effectively the same because all
of our operation methods on Context call Context.Schemas early in their
work and thus end up forcing all of the necessary schemas to load up
front nonetheless.
In the v0.12 timeframe we made contextComponentFactory an interface with
the expectation that we'd write mocks of it for tests, but in practice we
ended up just always using the same "basicComponentFactory" implementation
throughout.
In the interests of simplification then, here we replace that interface
and its sole implementation with a new concrete struct type
contextPlugins.
Along with the general benefit that this removes an unneeded indirection,
this also means that we can add additional methods to the struct type
without the usual restriction that interface types prefer to be small.
In particular, in a future commit I'm planning to add methods for loading
provider and provisioner schemas, working with the currently-unused new
fields this commit has included in contextPlugins, as compared to its
predecessor basicComponentFactory.
In earlier Terraform versions we used the set of all available plugins of
each type to make graph-building decisions, but in modern Terraform we
make those decisions based entirely on the configuration.
Consequently, we no longer need the methods which can enumerate all of the
known plugin components of a given type. Instead, we just try to
instantiate each of the plugins that the configuration refers to and then
handle the error when that fails, which typically means that the user
needs to run "terraform init" to install some new plugins.
In earlier incarnations of these transformers we used the set of all
available providers for tasks such as generating implied provider
configuration nodes.
However, in modern Terraform we can extract all of the information we need
from the configuration itself, and so these transformers weren't actually
using this set of provider addresses.
These also ended up getting left behind as sets of string rather than sets
of addrs.Provider in our earlier refactoring work, which didn't really
matter because the result wasn't used anywhere anyway.
Rather than updating these to use addrs.Provider instead, I've just
removed the unused arguments entirely in the hope of making it easier to
see what inputs these transformers use to make their decisions.
The public interface for loading schemas is Context.Schemas, which can
take into account the context's records of which plugin versions and
checksums we're expecting. loadSchemas is an implementation detail of
that, representing the part we run only after we've verified all of the
plugins.
For resources which are planned to move, render the previous run address
as additional information in the plan UI. For the case of a move-only
resource (which otherwise is unchanged), we also render that as a
planned change, but without any corresponding action symbol.
If all changes in the plan are moves without changes, the plan is no
longer considered "empty". In this case, we skip rendering the action
symbols in the UI.
This package level variable can be overridden at link time to allow
temporarily disabling the UI warning when experimental features are
enabled. This makes it easier to understand how UI will render when the
feature is no longer experimental.
This change is only for those developing Terraform.
This is the first test exercising the basic functionality of config-driven
move. We previously had it skipped because Terraform's previous design
of treating all three of the state artifacts as mutable attributes of
terraform.Context meant that it was too late during planning to deal with
the move operations, and thus this test was failing.
Thanks to the previous commit, which changes the terraform.Context API
such that we can defer creating the three state artifacts until we're
already doing planning, this test now works and shows Terraform correctly
handling a resource that was formerly called "a" and is now called "b",
with a "moved" block recording that renaming.
Previously terraform.Context was built in an unfortunate way where all of
the data was provided up front in terraform.NewContext and then mutated
directly by subsequent operations. That made the data flow hard to follow,
commonly leading to bugs, and also meant that we were forced to take
various actions too early in terraform.NewContext, rather than waiting
until a more appropriate time during an operation.
This (enormous) commit changes terraform.Context so that its fields are
broadly just unchanging data about the execution context (current
workspace name, available plugins, etc) whereas the main data Terraform
works with arrives via individual method arguments and is returned in
return values.
Specifically, this means that terraform.Context no longer "has-a" config,
state, and "planned changes", instead holding on to those only temporarily
during an operation. The caller is responsible for propagating the outcome
of one step into the next step so that the data flow between operations is
actually visible.
However, since that's a change to the main entry points in the "terraform"
package, this commit also touches every file in the codebase which
interacted with those APIs. Most of the noise here is in updating tests
to take the same actions using the new API style, but this also affects
the main-code callers in the backends and in the command package.
My goal here was to refactor without changing observable behavior, but in
practice there are a couple externally-visible behavior variations here
that seemed okay in service of the broader goal:
- The "terraform graph" command is no longer hooked directly into the
core graph builders, because that's no longer part of the public API.
However, I did include a couple new Context functions whose contract
is to produce a UI-oriented graph, and _for now_ those continue to
return the physical graph we use for those operations. There's no
exported API for generating the "validate" and "eval" graphs, because
neither is particularly interesting in its own right, and so
"terraform graph" no longer supports those graph types.
- terraform.NewContext no longer has the responsibility for collecting
all of the provider schemas up front. Instead, we wait until we need
them. However, that means that some of our error messages now have a
slightly different shape due to unwinding through a differently-shaped
call stack. As of this commit we also end up reloading the schemas
multiple times in some cases, which is functionally acceptable but
likely represents a performance regression. I intend to rework this to
use caching, but I'm saving that for a later commit because this one is
big enough already.
The proximal reason for this change is to resolve the chicken/egg problem
whereby there was previously no single point where we could apply "moved"
statements to the previous run state before creating a plan. With this
change in place, we can now do that as part of Context.Plan, prior to
forking the input state into the three separate state artifacts we use
during planning.
However, this is at least the third project in a row where the previous
API design led to piling more functionality into terraform.NewContext and
then working around the incorrect order of operations that produces, so
I intend that by paying the cost/risk of this large diff now we can in
turn reduce the cost/risk of future projects that relate to our main
workflow actions.
Here we wire through the "move results" into the graph walk data
structures so that all of the the nodes which produce
plans.ResourceInstanceChange values can capture the "PrevRunAddr" for
each resource instance.
This doesn't actually quite work yet, because the logic in Context.Plan
isn't actually correct and so the updated state from
refactoring.ApplyMoves isn't actually visible as the "previous run state".
For that reason, the context test in this commit is currently skipped,
with the intent of re-enabling it once the updated state is properly
propagating into the plan graph walk and thus we can actually react to
the result of the move while choosing actions for those addresses.
In order to expose the effect of any relevant "moved" statements we dealt
with prior to creating the plan, we'll record with each
ResourceInstanceChange both is current address and the address it was
tracked at for the previous run.
To save consumers of these objects from having to special-case the
situation where there _was_ no previous run (e.g. because this is a Create
change), we'll just pretend the previous run address was the same as the
current address in that case, the same as for an update without any
renaming in effect.
This includes a breaking change to the plan file format, but one that
doesn't require a version number increment because there is no ambiguity
between the two formats and so mismatched parsers will already fail with
an error message.
As of this commit we've just added the new field but not yet populated it
with any useful information: it always just matches Addr. A future commit
will wire this up to the result of applying the moves so that we can
populate it correctly. We also don't yet expose this new information
anywhere in the UI layer.
While blocks were not allowed to be computed by the provider, nested
objects can be. Remove the errors regarding blocks and verify unknown
values are valid.
We have a few different .proto files in this repository that all need to
get recompiled into .pb.go files each time we change them, but we were
previously handling that with some scripts that just assumed that protoc
and the relevant plugins were already installed on the system somewhere,
at the right versions.
In practice we've been constantly flopping between different versions of
these tools due to folks having different versions installed in their
development environments. In particular, the state of the .pb.go files
in the prior commit wasn't reproducible by any single version of the tools
because they've all slightly diverged from one another.
In the interests of being more consistent here and avoiding accidental
inconsistencies, we'll now centralize the protocol buffer compile steps
all into a single tool that knows how to fetch and install the expected
versions of the various tools we need and then run those tools with the
right options to get a stable result.
If we want to upgrade to either a newer protoc or a newer protoc-gen-go
in future then we'll do that in a central location and update all of the
.pb.go files at the same time, so that we're always consistently tracking
the same version of protocol buffers everywhere.
While doing this I attempted to keep as close as possible to the toolchain
we'd most recently used, but since they were not consistent with each
other they've now all changed which version numbers they record at minimum,
and the planproto stub in particular now also has a slightly different
descriptor serialization but is otherwise offering the same API.
CanChainFrom needs to be able to handle move statements from different
relative modules, re-implementing with addrs.anyKey
Add the anyKey InstanceKey value to the addrs package to simplify module
path comparison. This allows all combinations of module path
representation to be normalized into a ModuleInstance which can be
compared directly, rather than dealing with multiple levels of different
prefix types.
The -force-copy flag to init should automatically migrate state.
Previously this was not applied to one case: when migrating from
a backend with multiple workspaces to another backend supporting
multiple workspaces. I believe this was an oversight so this commit
fixes that.
We're aware of several quirks of this command's current design, which
result from some existing architectural limitations that we can't address
immediately.
However, we do still want to make this command available in its current
capacity as an incremental improvement, so as a compromise we'll document
it as experimental. Our intent here is to exclude it from the
Terraform 1.0 Compatibility Promises so that we can have the space to
continue to improve the design as other parts of the overall Terraform
system gain new capabilities.
We don't currently have any concrete plan for this command to be
stabilized and subject to compatibility promises. That decision will
follow from ongoing discussions with other teams whose systems may need to
change in order to support the final design of "terraform add".
The code adopted from block diffs was not set to handle null and unknown
values, as those are not allowed for blocks.
We also revert the change to formatting nested object types as single
attributes, because the attribute formatter cannot handle sensitive
values from the schema. This presents some awkward syntax for diffs for
now, but should suffice until the entire formatter can be refactored to
better handle these new nested types.
Go 1.17 includes a breaking change to both net.ParseIP and net.ParseCIDR
functions to reject IPv4 address octets written with leading zeros.
Our use of these functions as part of the various CIDR functions in the
Terraform language doesn't have the same security concerns that the Go
team had in evaluating this change to the standard library, and so we
can't justify an exception to our v1.0 compatibility promises on the same
sort of security grounds that the Go team used to justify their
compatibility exception.
For that reason, we'll now use our own fork of the Go library functions
which has the new check disabled in order to preserve the prior behavior.
We're taking this path, rather than pre-normalizing the IP address before
calling into the standard library, because an additional normalization
layer would be entirely new code and additional complexity, whereas this
fork is relatively minor in terms of code size and avoids any significant
changes to our own calls to these functions.
Thanks to the Kubernetes team for their prior work on carving out a subset
of the "net" package for their similar backward-compatibility concern.
Our "ipaddr" package here is a lightly-modified fork of their fork, with
only the comments changed to talk about Terraform instead of Kubernetes.
This fork is not intended for use in any other future feature
implementations, because they wouldn't be subject to the same
compatibility constraints as our existing functions. We will use these
forked implementations for new callers only if consistency with the
behavior of the existing functions is a key requirement.
This includes the addition of the new "//go:build" comment form in addition
to the legacy "// +build" notation, as produced by gofmt to ensure
consistent behavior between Go versions. The new directives are all
equivalent to what was present before, so there's no change in behavior.
Go 1.17 continues to use the Unicode 13 tables as in Go 1.16, so this
upgrade does not require also upgrading our Unicode-related dependencies.
This upgrade includes the following breaking changes which will also
appear as breaking changes for Terraform users, but that are consistent
with the Terraform v1.0 compatibility promises.
- On MacOS, Terraform now requires macOS 10.13 High Sierra or later.
This upgrade also includes the following breaking changes which will
appear as breaking changes for Terraform users that are inconsistent with
our compatibility promises, but have justified exceptions as follows:
- cidrsubnet, cidrhost, and cidrnetmask will now reject IPv4 CIDR
addresses whose decimal components have leading zeros, where previously
they would just silently ignore those leading zeros.
This is a security-motivated exception to our compatibility promises,
because some external systems interpret zero-prefixed octets as octal
numbers rather than decimal, and thus the previous lenient parsing could
lead to a different interpretation of the address between systems, and
thus potentially allow bypassing policy when configuring firewall rules
etc.
This upgrade also includes the following breaking changes which could
_potentially_ appear as breaking changes for Terraform users, but that do
not in practice for the reasons given:
- The Go net/url package no longer allows query strings with pairs
separated by semicolons instead of ampersands. This primarily affects
HTTP servers written in Go, and Terraform includes a special temporary
HTTP server as part of its implementation of OAuth for "terraform login",
but that server only needs to accept URLs created by Terraform itself
and Terraform does not generate any URLs that would be rejected.
Add implementations of CanChainFrom and NestedWithin for
MoveEndpointInModule.
CanChainFrom allows the linking of move statements of the same address,
which means the prior destination address must equal the following
source address. If the destination and source addresses are of different
types, they must be covered by NestedWithin rather than CanChainFrom.
NestedWithin checks if the destination contains the source address. Any
matching types would be covered by CanChainFrom.