The init error was output deep in the backend by detecting a
special ResourceProviderError and formatted directly to the CLI.
Create some Diagnostics closer to where the problem is detected, and
passed that back through the normal diagnostic flow. While the output
isn't as nice yet, this restores the helpful error message and makes the
code easier to maintain. Better formatting can be handled later.
A provider may react to a create or update failing by returning error
diagnostics and a partially-updated or nil new value, in which case we
do not expect our AssertObjectCompatible consistency check to succeed: the
provider is just assumed to be doing the best it can to preserve whatever
partial outcome it was able to achieve.
However, if errors are accompanied with a nil new value after an update,
we'll assume that the provider is telling us it wasn't able to get far
enough to make any change at all, and so we'll retain the prior value in
state. This ensures that a provider can't cause an object to be forgotten
from the state just because an update failed.
RequiresReplace paths with IndexSteps that have been added or removed
may fail to apply against one of the two state values. Only error out if
the path cannot be applied to both values.
Prior to Terraform 0.12 there were certain behaviors we expected from
providers that were actually just details of the SDK and not part of the
enforced contract.
For 0.12 we're now codifying some of these behaviors explicitly via safety
checks in core, thus ensuring that all future providers will behave in a
consistent way that users can rely on.
Unfortunately, due to the hand-written nature of the mock provider
implementations we use in tests, they have been getting away with some
unusual behaviors that don't match our usual expectations, and our safety
checks now detect those as incorrect behaviors.
To address this, we make the minimal changes to each test to ensure that
its mock provider behaves in a consistent way, which requires that values
set in config be represented correctly in the plan and ultimately saved
in the new state, without any changes along the way. In particular, the
common testDiffFn implementation has historically used a number of special
hidden attributes to trigger special behaviors, and our new rules require
that these special settings propagate properly through the plan and into
the state.
Due to the inprecision of our shimming from the legacy SDK type system to
the new Terraform Core type system, the legacy SDK produces a number of
inconsistencies that produce only minor quirky behavior or broken
edge-cases. To retain compatibility with those existing weird behaviors,
the legacy SDK opts out of our safety checks.
The intent here is to allow existing providers to continue to do their
previous unsafe behaviors for now, accepting that this will allow certain
quirky bugs from previous releases to persist, and then gradually migrate
away from the legacy SDK and remove this opt-out on a per-resource basis
over time.
As with the apply-time safety check opt-out, this is reserved only for
the legacy SDK and must not be used in any new SDK implementations. We
still include any inconsistencies as warnings in the logs as an aid to
anyone debugging weird behavior, so that they can see situations where
blame may be misplaced in the user-visible error messages.
We've allowed the legacy SDK an opt-out from the post-apply safety checks,
but previously we produced only a generic warning message in that case.
Now instead we'll still run the safety checks, but report the results in
the logs instead of as error diagnostics.
This should allow developers who are debugging strange interactions
between buggy legacy providers to get better insight into what's going
on upstream in order to help explain what's going on when these problems
inevitably get caught by other downstream safety checks when trying to
make use of these invalid results.
We've been gradually adding safety checks of this sort throughout the
lifecycle to help ensure that buggy providers can't introduce
hard-to-diagnose downstream failures and misbehavior. This completes the
set by verifying during plan time that the provider has produced a plan
that actually achieves the goals defined in the configuration.
In particular, this catches the situation where a provider may incorrectly
override a value explicitly set in configuration, which avoids creating
confusion by betraying the reasonable user expectation that referencing an
explicitly-defined attribute will produce exactly the value shown in
configuration.
The helper/schema handling of lists loses empty string values, but
retains the correct count. Only re-count the values if the count is
missing entirely, and allow our shims to re-populate the zero values.
In an earlier commit we changed objchange.ProposedNewObject so that the
task of populating unknown values for attributes not known during apply
is the responsibility of the provider's PlanResourceChange method, rather
than being handled automatically.
However, we were also using objchange.ProposedNewObject to construct the
placeholder new object for a deferred data resource read, and so we
inadvertently broke that deferral behavior. Here we restore the old
behavior by introducing a new function objchange.PlannedDataResourceObject
which is a specialized version of objchange.ProposedNewObject that
includes the forced behavior of populating unknown values, because the
provider gets no opportunity to customize a deferred read.
TestContext2Plan_createBeforeDestroy_depends_datasource required some
updates here because its implementation of PlanResourceChange was not
handling the insertion of the unknown value for attribute "computed".
The other changes here are just in an attempt to make the flow of this
test more obvious, by clarifying that it is simulating a -refresh=false
run, which effectively forces a deferred read since we skip the eager
read that would normally happen in the refresh step.
Now that ProposedNewState uses null to represent Computed attributes not
set in the configuration, the provider must fill in the unknown value for
"computed" in its plan result.
It seems that this test was incorrectly updated during our bulk-fix after
integrating the HCL2 work, but it didn't really matter because the
ReadDataSource function isn't called in the happy path anyway. But to
make the intent clearer here, we also now make ReadDataSource return an
error if it is called, making it explicit that no call is expected.
Data resources do not have a plan/apply distinction, so it is never valid
for a data resource to produce unknown values in its result object.
Unknown values in the data resource _config_ cause us to postpone the read
altogether, so a data source never receives unknown values as input and
therefore may never produce unknown values as output.
The shim layer for the legacy SDK type system is not precise enough to
guarantee it will produce identical results between plan and apply. In
particular, values that are null during plan will often become zero-valued
during apply.
To avoid breaking those existing providers while still allowing us to
introduce this check in the future, we'll introduce a rather-hacky new
flag that allows the legacy SDK to signal that it is the legacy SDK and
thus disable the check.
Once we start phasing out the legacy SDK in favor of one that natively
understands our new type system, we can stop setting this flag and thus
get the additional safety of this check without breaking any
previously-released providers.
No other SDK is permitted to set this flag, and we will remove it if we
ever introduce protocol version 6 in future, assuming that any provider
supporting that protocol will always produce consistent results.
Previously we would allow providers to change anything about the planned
object value during apply, possibly returning an entirely-unrelated object
of the same type. In practice this led to some subtle bugs where a single
planned attribute value would change during apply and cause a downstream
failure due to a dependent resource now seeing input other than what
_it_ expected during plan.
Now we'll produce an explicit error message for this case which places the
blame with the correct party: the upstream resource that changed. Without
this, unexpected changes would often lead to the downstream resource
implementation being blamed in error message even though it was just
reacting to the change from upstream.
As with most errors during apply, we'll still save the updated value in
the state but we'll halt the walk to ensure that the unexpected value
cannot propagate further and cause the result to potentially diverge
greatly from the changeset shown in the plan.
Compared to Terraform 0.11, we expect to see this error in many of the
same cases we saw the "diffs didn't match during apply" error in earlier
versions, since it is likely that many errors of that sort were the result
of unexpected upstream changes being incorrectly blamed on the downstream
resource that then used the result.
Because Terraform Core has traditionally not checked that the final apply
result is consistent with what was planned, some of our apply tests were
producing inconsistent results.
Here we fix all of that so that they produce something compatible with
what they planned. This doesn't actually achieve anything in isolation,
but we're about to start enforcing this consistency in a subsequent
commit.
We were previously using cty.Path.Apply, which serves a similar purpose
but implements the more restrictive traversal behaviors down at the cty
layer. hcl.ApplyPath uses the same rules as HCL expressions and so ensures
consistent behavior with normal user expressions.
cty.Path.Apply also previously had a crashing bug (discussed in #20084)
that was causing a panic here. That has now been fixed in cty, but since
we're no longer using it here that's a moot point. The HCL traversing
implementation has been fuzz-tested and unit tested a lot more thoroughly
so should not run into the same crashers we saw with cty before.
When elements are removed from a list, all attributes may not be present
in the diff. Once the individual attributes diffs are applied, use the
length to truncate the flatmapped list to the correct length.
If there were no matching keys, and there was no diff at all, don't set
a zero count for the container. Normally Providers can't reliably detect
empty vs unset here, but there are some cases that worked.
Missing prefix in map recount. This generally passes tests since the
actual count should already be there and be correct, then ethe extra key
is ignored by the shims.
The previous version assumed the diff could be applied verbatim, and
only used the schema at the top level since diffs are "flat". This
turned out to not work reliably with nested blocks. The new Apply method
is driven completely by the schema, and handles nested blocks separately
from other collections.
In an ideal world, providers are supposed to respond to errors during
apply by returning a partial new state alongside the error diagnostics.
In practice though, our SDK leaves the new value set to nil for certain
errors, which was causing Terraform to "forget" the object altogether by
assuming that the provider intended to say "null".
We now adjust that assumption to apply only in the delete case. In all
other cases (including updates) we retain the prior state if the new
state is given as nil. Although we could potentially fix this in the SDK
itself, I expect this is a likely bug in other future SDKs for other
languages too, so this new assumption is a safer one to make to be
resilient to data loss when providers don't behave perfectly.
Providers that return both nil new value and no errors are considered
buggy, but unfortunately that applies to the mocks in many of our tests,
so for pragmatic reasons we can't generate an error for that case as we do
for other "should never happen" situations. Instead, we'll just retain the
prior value in the state so the user can retry.
There are a few constructs from 0.11 and prior that cause 0.12 parsing to
fail altogether, which previously created a chicken/egg problem because
we need to install the providers in order to run "terraform 0.12upgrade"
and thus fix the problem.
This changes "terraform init" to use the new "early configuration" loader
for module and provider installation. This is built on the more permissive
parser in the terraform-config-inspect package, and so it allows us to
read out the top-level blocks from the configuration while accepting
legacy HCL syntax.
In the long run this will let us do version compatibility detection before
attempting a "real" config load, giving us better error messages for any
future syntax additions, but in the short term the key thing is that it
allows us to install the dependencies even if the configuration isn't
fully valid.
Because backend init still requires full configuration, this introduces a
new mode of terraform init where it detects heuristically if it seems like
we need to do a configuration upgrade and does a partial init if so,
before finally directing the user to run "terraform 0.12upgrade" before
running any other commands.
The heuristic here is based on two assumptions:
- If the "early" loader finds no errors but the normal loader does, the
configuration is likely to be valid for Terraform 0.11 but not 0.12.
- If there's already a version constraint in the configuration that
excludes Terraform versions prior to v0.12 then the configuration is
probably _already_ upgraded and so it's just a normal syntax error,
even if the early loader didn't detect it.
Once the upgrade process is removed in 0.13.0 (users will be required to
go stepwise 0.11 -> 0.12 -> 0.13 to upgrade after that), some of this can
be simplified to remove that special mode, but the idea of doing the
dependency version checks against the liberal parser will remain valuable
to increase our chances of reporting version-based incompatibilities
rather than syntax errors as we add new features in future.
In prior versions of Terraform we permitted inconsistent use of indexes
in resource references, but in as of 0.12 the index usage must correlate
properly with whether "count" is set on the resource.
Since users are likely to have existing configurations with incorrect
usage, here we introduce some specialized error messages for situations
where we can detect such issues statically. This seems to cover all of the
common patterns we've seen in practice.
Some usage patterns will fall back on a less-helpful dynamic error here,
but no configurations coming from 0.11 can end up that way because 0.11
did not permit forms such as aws_instance.no_count[count.index].bar that
this validation would not be able to "see".
Our configuration upgrade tool also contains a fix for this already, but
it takes a more conservative approach of adding the index [1] rather than
[count.index] because it can't be sure (without human help) if correlation
of indices is what was intended.
Previously we used the native slash type for the host platform, but that
leads to issues if the same configuration is applied on both Windows and
non-Windows systems.
Since Windows supports slashes and backslashes, we can safely return
always slashes here and require that users combine the result with
subsequent path parts using slashes, like:
"${path.module}/foo/bar"
Previously the above would lead to an error on Windows if path.module
contained any backslashes.
This is not really possible to unit test directly right now since we
always run our tests on Unix systems and filepath.ToSlash is a no-op on
Unix. However, this does include some tests for the basic behavior to
verify that it's not regressed as a result of this change.
This will need to be reported in the changelog as a potential breaking
change, since anyone who was using Terraform _exclusively_ on Windows may
have been using expressions like "${path.module}foo\\bar" which they will
now need to update.
This fixes#14986.
We already catch indirect cycles through the normal cycle detector, but
we never create self-edges in the graph so we need to handle a direct
self-reference separately here.
The prior behavior was simply to produce an incorrect result (since the
local value wasn't assigned a new value yet).
This fixes#18503.
Older versions of terraform could save the backend hash number in a
value larger than an int.
While we could conditionally decode the state into an intermediary data
structure for upgrade, or detect the specific decode error and modify
the json, it seems simpler to just decode into the most flexible value
for now, which is a uint64.
Make sure that NodeDestroyableDataResource has a ResolvedProvider to
call EvalWriteState. This entails setting the ResolvedProvider in
concreteResourceDestroyable, as well as calling EvalGetProvider in
NodeDestroyableDataResource to load the provider schema.
Even though writing the state for a data destroy node should just be
removing the instance, every instance written sets the Provider for the
entire resource. This means that when scaling back a counted data
source, if the removed instances are written last, the data source will
be missing the provider in the state.
Validate should not require state or changes to be present. Break out
early when using evaluationStateData during walkValidate before checking
state or changes, to prevent errors when indexing resources that haven't
been expanded.
Both depends_on and ignore_changes contain references to objects that we
can validate.
Historically Terraform has not validated these, instead just ignoring
references to non-existent objects. Since there is no reason to refer to
something that doesn't exist, we'll now verify this and return errors so
that users get explicit feedback on any typos they may have made, rather
than just wondering why what they added seems to have no effect.
This is particularly important for ignore_changes because users have
historically used strange values here to try to exploit the fact that
Terraform was resolving ignore_changes against a flatmap. This will give
them explicit feedback for any odd constructs that the configuration
upgrade tool doesn't know how to detect and fix.
If an instance object in state has an earlier schema version number then
it is likely that the schema we're holding won't be able to decode the
raw data that is stored. Instead, we must ask the provider to upgrade it
for us first, which might also include translating it from flatmap form
if it was last updated with a Terraform version earlier than v0.12.
This ends up being a "seam" between our use of int64 for schema versions
in the providers package and uint64 everywhere else. We intend to
standardize on int64 everywhere eventually, but for now this remains
consistent with existing usage in each layer to keep the type conversion
noise contained here and avoid mass-updates to other Terraform components
at this time.
This also includes a minor change to the test helpers for the
backend/local package, which were inexplicably setting a SchemaVersion of
1 on the basic test state but setting the mock schema version to zero,
creating an invalid situation where the state would need to be downgraded.