Previously we used a single plan action "Replace" to represent both the
destroy-before-create and the create-before-destroy variants of replacing.
However, this forces the apply graph builder to jump through a lot of
hoops to figure out which nodes need it forced on and rebuild parts of
the graph to represent that.
If we instead decide between these two cases at plan time, the actual
determination of it is more straightforward because each resource is
represented by only one node in the plan graph, and then we can ensure
we put the right nodes in the graph during DiffTransformer and thus avoid
the logic for dealing with deposed instances being spread across various
different transformers and node types.
As a nice side-effect, this also allows us to show the difference between
destroy-then-create and create-then-destroy in the rendered diff in the
CLI, although this change doesn't fully implement that yet.
Remove a test that is no longer needed, since provider must be
explicitly defined for orphaned modules, and is covered in other context
tests.
Udpate a test fixture to better represent the origianl missing map
issue, since the ability to detect nil now made the old test invalid.
This test was relying on an odd definition of "invalid" from prior
versions of Terraform, where it would be an error to access an attribute
that exists as part of the resource type schema but the provider
implementation neglected to set it.
This was an implementation detail though, caused by the flatmap
representation and the fact that core itself didn't have access to the
schema to do static validation. Now the original usage returns a null
value because the "value" attribute is defined, and so we need a new
test fixture that accesses an attribute that is not defined in the schema
at all.
I misunderstood the logic here on the first pass of porting to the new
provider and state types: EvalUndeposeState is supposed to return the
deposed object back to being current again, so we can undo the deposing
in the case where the create leg fails.
If we don't do this, we end up leaving the instance with no current object
at all and with its prior object deposed, and then the later destroy
node deletes that deposed object, leaving the user with no object at all.
For safety we skip this restoration if there _is_ a new current object,
since a failed create can still produce a partial result which we need
to keep to avoid losing track of any remote objects that were successfully
created.
We now correctly prune out empty modules after destroying everything
inside them, so we need to update this expectation string to match the new
behavior, rather than before when it was actually describing a buggy
result.
This was assuming a previous buggy behavior of failing to prune out an
empty module in the state. The new state code doesn't have this bug, so
we must update the expected result to reflect that.
This regressed after the recent changes to include deposed object changes
explicitly in the plan, since it was previously (correctly) asserting that
only the current object was covered by the plan.
Now we assert that both are present, and check that they have the correct
actions associated so that we are sure we're going to only delete the
deposed object and not the current object that sits alongside it.
This shim for the benefit of our old tests was not handling the situation
where an InstanceState only contains deposed instances and no current
instance. This is a strange, rare situation but one that does come up in
practice in some odd cases.
Previously our handling of create_before_destroy -- and of deposed objects
in particular -- was rather "implicit" and spread over various different
subsystems. We'd quietly just destroy every deposed object during a
destroy operation, without any user-visible plan to do so.
Here we make things more explicit by tracking each deposed object
individually by its pseudorandomly-allocated key. There are two different
mechanisms at play here, building on the same concepts:
- During a replace operation with create_before_destroy, we *pre-allocate*
a DeposedKey to use for the prior object in the "apply" node and then
pass that exact id to the destroy node, ensuring that we only destroy
the single object we planned to destroy. In the happy path here the
user never actually sees the allocated deposed key because we use it and
then immediately destroy it within the same operation. However, that
destroy may fail, which brings us to the second mechanism:
- If any deposed objects are already present in state during _plan_, we
insert a destroy change for them into the plan so that it's explicit to
the user that we are going to destroy these additional objects, and then
create an individual graph node for each one in DiffTransformer.
The main motivation here is to be more careful in how we handle these
destroys so that from a user's standpoint we never destroy something
without the user knowing about it ahead of time.
However, this new organization also hopefully makes the code itself a
little easier to follow because the connection between the create and
destroy steps of a Replace is reprseented in a single place (in
DiffTransformer) and deposed instances each have their own explicit graph
node rather than being secretly handled as part of the main instance-level
graph node.
This test was re-using the same context to run three consecutive
plan/apply operations, which is not safe because we will accumulate more
planned changes with each change, creating duplicate entries in the diff.
Instead, to properly simulate a sequence of consecutive runs of Terraform
we must start with a fresh context each time, though still pass forward
the previous state which would in the real world be persisted via a state
manager between these runs.
This inverts the previous logic so that it's the status of an object in
the state that decides whether we'll use its value from the plan. This
fixes the problem that otherwise after we've actually applied the change
the partial planned object will continue to shadow the final object in
state.
These two being distinct is an old-world concept, but we need to ensure
that they match properly here to ensure that we don't leave dangling
incorrect values for "id".
There were two problems with this test as originally written:
- Its ApplyFn was handling a destroy diff by returning a new object, and
thus not actually destroying the object in question at all.
- It was treating unknown values in the diff as invalid during apply, but
these are in fact now expected as a way for the provider to distinguish
whether an optional+computed attribute is set in config.
With those changes in mind, this test isn't really testing anything
special anymore, but is still a straightforward test of a simple
plan+apply running to completion without error.
This test's original outcome was invalid because it didn't actually
configure a mock apply function, and so _any_ apply operation would appear
to have destroyed the object it was given. This was visible in the fact
that the configuration contains aws_instance.bar but yet it was not
present in the expected state string.
Now we use the standard testApplyFn, and update the expected output to
include the aws_instance.bar object that is created by a Create change.
When applying a diff to a value we verify that the "old" value in diff is
consistent with the given prior value, as a safety check.
The mock must comply with this or else any tests that produce diffs with
computed new values will not pass the safety check.
This change is verified by the now-passing TestContext2Apply_taintDep .
These tests were relying on the full InstanceInfo we used to give to
providers, but the new API doesn't do that and so we will instead lean on
the ID from the state to recognize the apply ordering.
Previously testDiffFn was just assuming that the prior value for "type"
was always the empty string, but that doesn't hold if a mocked object is
updated in-place with a previously-populated value for type.
This wasn't a problem before because the old values in the diff were
largely just for presentation to the user, but we do now verify that the
old values match what we're applying to as an extra safety check and so
we must populate the old value properly.
This fix is verified by TestContext2Apply_Provisioner_Diff.
Now that we're marking errored creates as tainted in the state, the
object created by this test will get marked as tainted due to the error
about it containing an unknown value even after apply.
Since this test sets its own special schema for aws_instance, its expected
output must now be adjusted to only expect values that conform to that
schema. The extraneous attributes like "type" which testDiffFn produces
are no longer visible unless declared in schema.
We also need to now declare "id" as a computed attribute in order for our
state stringer shim to properly populate the formerly-special "ID".
We can't use diff attributes to differentiate instances during destroy
because destroy diffs don't have any attributes set in the old API.
Also, our new state management code correctly prunes out the empty
module.child before returning, so our expected output is updated to not
expect that to still be present.
When we re-run EvalDiff during apply, we may have already completed the
destroy leg of a replace operation, leaving us in a different situation
than we were when we made the original planned change.
Therefore as a special case we will allow a create to turn back into a
replace if there was an earlier diff that requested that.
If the prior object is tainted, we behave as if it doesn't exist at all
for most of our logic here but then at the end turn it into a synthetic
replace operation going from the old object to the new object, similarly
to how we'd behave if given an argument change that "requires
replacement".
We can be more relaxed about our rules that a create musn't return null
or a destroy must return null if the provider also itself indicated an
error. In that case, it's expected that the return value is describing a
partial result, and so we'll just store it and move on.
The mechanism for a provider to pre-populate parts of the connection
config for subsequent provisioners is no longer present in the new
protocol, since it was rarely used, poorly documented, and for many
resource types had no obvious good defaults.
Although we have a special case where a result of the wrong type will bail
early, we must keep that set of diagnostics separate so that we can still
run to completion when there are _already_ diagnostics present (from the
provider's response) but the return value _is_ type-conforming.
This fix is verified by TestContext2Apply_provisionerCreateFail.
Our previous mechanism for dealing with tainting relied on directly
mutating the InstanceState object to mark it as such. In our new state
models we consider the instance objects to be immutable by convention, and
so we frequently copy them. As a result, the taint flagging was no longer
making it all the way through the apply evaluation process.
Here we now implement tainting as a separate step in the evaluation
process, creating a copy of the object with a tainted status if there were
any errors during creation.
This introduces a new behavior where any provider-level errors during
creation will also cause an instance to be marked as tainted if any object
is returned at all. Create-time errors _normally_ result in no object at
all, but the provider might return an object if the failure occurred at
a subsequent step of a multi-step creation process and so left behind a
remote object that needs to be cleaned up on a future run.
Since StateReferences was implemented on NodeAbstractResource rather than
NodeAbstractResourceInstance it wasn't properly detecting references to
the same instance as self-references.
Now that we are using "seen" to filter out duplicates we can also simplify
how we handle these self-references by just pretending we saw them before
we even start the loop.
This change is confirmed by
TestContext2Apply_provisionerMultiSelfRefSingle
The error handling here is a bit tricky due to the ability for users to
opt out of aborting on error. It's important that we keep straight the
distinction between applyDiags and diags so we can tell the difference
between the errors from _this_ provisioner and the errors for the entire
run so far.