This commit adds a new callback, DiffSuppressFunc, to the schema.Schema
structure. If set for a given schema, a callback to the user-supplied
function will be made for each attribute for which the default
type-based diff mechanism produces an attribute diff. Returning `true`
from the callback will suppress the diff (i.e. pretend there was no
diff), and returning false will retain it as part of the plan.
There are a number of motivating examples for this - one of which is
included as an example:
1. On SSH public keys, trailing whitespace does not matter in many
cases - and in some cases it is added by provider APIs. For
digitalocean_ssh_key resources we previously had a StateFunc that
trimmed the whitespace - we now have a DiffSuppressFunc which
verifies whether the trimmed strings are equivalent.
2. IAM policy equivalence for AWS. A good proportion of AWS issues
relate to IAM policies which have been "normalized" (used loosely)
by the IAM API endpoints. This can make the JSON strings differ
from those generated by iam_policy_document resources or template
files, even though the semantics are the same (for example,
reordering of `bucket-prefix/` and `bucket-prefix/*` in an S3
bucket policy. DiffSupressFunc can be used to test for semantic
equivalence rather than pure text equivalence, but without having to
deal with the complexity associated with a full "provider-land" diff
implementation without helper/schema.
Fixes issue where a resource marked as tainted with no other attribute
diffs would never show up in the plan or apply as needing to be
replaced.
One unrelated test needed updating due to a quirk in the testDiffFn
logic - it adds a "type" field diff if the diff is non-Empty. NBD
Although DiffFieldReader was the one mostly responsible for a buggy behaviour
more tests were added throughout the debugging process most of which
would fail without the bugfix.
- ResourceData
- MultiLevelFieldReader
- MapFieldReader
- DiffFieldReader
The helper/schema framework for building providers previously validated
in all cases that each field being set in state was in the schema.
However, in order to support remote state in a usable fashion, the need
has arisen for the top level attributes of the resource to be created
dynamically. In order to still be able to use helper/schema, this commit
adds the capability to assign additional fields.
Though I do not forsee this being used by providers other than remote
state (and that eventually may move into Terraform Core rather than
being a provider), the usage and semantics are:
To opt into dynamic attributes, add a schema attribute named
"__has_dynamic_attributes", and make it an optional string with no
default value, in order that it does not appear in diffs:
"__has_dynamic_attributes": {
Type: schema.TypeString
Optional: true
}
In the read callback, use the d.UnsafeSetFieldRaw(key, value) function
to set the dynamic attributes.
Note that other fields in the schema _are_ copied into state, and that
the names of the schema fields cannot currently be used as dynamic
attribute names, as we check to ensure a value is not already set for a
given key.
The serializeCollectionMemberForHash helper can't be called for the
MapType values, because MapType doesn't have a schema.Elem. Instead, we
can write the key/value pairs directly to the buffer. This still doesn't
allow for nested maps or lists, but we need to define that use case
before committing to it here.
The flatmapped representation of state prior to this commit encoded maps
and lists (and therefore by extension, sets) with a key corresponding to
the number of elements, or the unknown variable indicator under a .# key
and then individual items. For example, the list ["a", "b", "c"] would
have been encoded as:
listname.# = 3
listname.0 = "a"
listname.1 = "b"
listname.2 = "c"
And the map {"key1": "value1", "key2", "value2"} would have been encoded
as:
mapname.# = 2
mapname.key1 = "value1"
mapname.key2 = "value2"
Sets use the hash code as the key - for example a set with a (fictional)
hashcode calculation may look like:
setname.# = 2
setname.12312512 = "value1"
setname.56345233 = "value2"
Prior to the work done to extend the type system, this was sufficient
since the internal representation of these was effectively the same.
However, following the separation of maps and lists into distinct
first-class types, this encoding presents a problem: given a state file,
it is impossible to tell the encoding of an empty list and an empty map
apart. This presents problems for the type checker during interpolation,
as many interpolation functions will operate on only one of these two
structures.
This commit therefore changes the representation in state of maps to use
a "%" as the key for the number of elements. Consequently the map above
will now be encoded as:
mapname.% = 2
mapname.key1 = "value1"
mapname.key2 = "value2"
This has the effect of an empty list (or set) now being encoded as:
listname.# = 0
And an empty map now being encoded as:
mapname.% = 0
Therefore we can eliminate some nasty guessing logic from the resource
variable supplier for interpolation, at the cost of having to migrate
state up front (to follow in a subsequent commit).
In order to reduce the number of potential situations in which resources
would be "forced new", we continue to accept "#" as the count key when
reading maps via helper/schema. There is no situation under which we can
allow "#" as an actual map key in any case, as it would not be
distinguishable from a list or set in state.
This an effort to address hashicorp/terraform#516.
Adding the Sensitive attribute to the resource schema, opening up the
ability for resource maintainers to mark some fields as sensitive.
Sensitive fields are hidden in the output, and, possibly in the future,
could be encrypted.
This means it’s shown correctly in a plan and takes into account any
actions that are dependant on the tainted resource and, vice verse, any
actions that the tainted resource depends on.
So this changes the behaviour from saying this resource is tainted so
just forget about it and make sure it gets deleted in the background,
to saying I want that resource to be recreated (taking into account the
existing resource and it’s place in the graph).
For backward compatibility we will continue to support using the data
sources that were formerly logical resources as resources for the moment,
but we want to warn the user about it since this support is likely to
be removed in future.
This is done by adding a new "deprecation message" feature to
schema.Resource, but for the moment this is done as an internal feature
(not usable directly by plugins) so that we can collect additional
use-cases and design a more general interface before creating a
compatibility constraint.
Historically we've had some "read-only" and "logical" resources. With the
addition of the data source concept these will gradually become data
sources, but we need to retain backward compatibility with existing
configurations that use the now-deprecated resources.
This shim is intended to allow us to easily create a resource from a
data source implementation. It adjusts the schema as needed and adds
stub Create and Delete implementations.
This would ideally also produce a deprecation warning whenever such a
shimmed resource is used, but the schema system doesn't currently have
a mechanism for resource-specific validation, so that remains just a TODO
for the moment.
In the "schema" layer a Resource is just any "thing" that has a schema
and supports some or all of the CRUD operations. Data sources introduce
a new use of Resource to represent read-only resources, which require
some different InternalValidate logic.
This is a breaking change to the ResourceProvider interface that adds the
new operations relating to data sources.
DataSources, ValidateDataSource, ReadDataDiff and ReadDataApply are the
data source equivalents of Resources, Validate, Diff and Apply (respectively)
for managed resources.
The diff/apply model seems at first glance a rather strange workflow for
read-only resources, but implementing data resources in this way allows them
to fit cleanly into the standard plan/apply lifecycle in cases where the
configuration contains computed arguments and thus the read must be deferred
until apply time.
Along with breaking the interface, we also fix up the plugin client/server
and helper/schema implementations of it, which are all of the callers
used when provider plugins use helper/schema. This would be a breaking
change for any provider plugin that directly implements the provider
interface, but no known plugins do this and it is not recommended.
At the helper/schema layer the implementer sees ReadDataApply as a "Read",
as opposed to "Create" or "Update" as in the managed resource Apply
implementation. The planning mechanics are handled entirely within
helper/schema, so that complexity is hidden from the provider implementation
itself.
This adds a test and the support necessary to read from native maps
passed as variables via interpolation - for example:
```
resource ...... {
mapValue = "${var.map}"
}
```
We also add support for interpolating maps from the flat-mapped resource
config, which is necessary to support assignment of computed maps, which
is now valid.
Unfortunately there is no good way to distinguish between a list and a
map in the flatmap. In lieu of changing that representation (which is
risky), we assume that if all the keys are numeric, this is intended to
be a list, and if not it is intended to be a map. This does preclude
maps which have purely numeric keys, which should be noted as a
backwards compatibility concern.
This commit adds support for native list variables and outputs, building
up on the previous change to state. Interpolation functions now return
native lists in preference to StringList.
List variables are defined like this:
variable "test" {
# This can also be inferred
type = "list"
default = ["Hello", "World"]
}
output "test_out" {
value = "${var.a_list}"
}
This results in the following state:
```
...
"outputs": {
"test_out": [
"hello",
"world"
]
},
...
```
And the result of terraform output is as follows:
```
$ terraform output
test_out = [
hello
world
]
```
Using the output name, an xargs-friendly representation is output:
```
$ terraform output test_out
hello
world
```
The output command also supports indexing into the list (with
appropriate range checking and no wrapping):
```
$ terraform output test_out 1
world
```
Along with maps, list outputs from one module may be passed as variables
into another, removing the need for the `join(",", var.list_as_string)`
and `split(",", var.list_as_string)` which was previously necessary in
Terraform configuration.
This commit also updates the tests and implementations of built-in
interpolation functions to take and return native lists where
appropriate.
A backwards compatibility note: previously the concat interpolation
function was capable of concatenating either strings or lists. The
strings use case was deprectated a long time ago but still remained.
Because we cannot return `ast.TypeAny` from an interpolation function,
this use case is no longer supported for strings - `concat` is only
capable of concatenating lists. This should not be a huge issue - the
type checker picks up incorrect parameters, and the native HIL string
concatenation - or the `join` function - can be used to replicate the
missing behaviour.
For a long time now, the diff logic has relied on the behavior of
`mapstructure.WeakDecode` to determine how various primitives are
converted into strings. The `schema.DiffString` function is used for
all primitive field types: TypeBool, TypeInt, TypeFloat, and TypeString.
The `mapstructure` library's string representation of booleans is "0"
and "1", which differs from `strconv.FormatBool`'s "false" and "true"
(which is used in writing out boolean fields to the state).
Because of this difference, diffs have long had the potential for
cosmetically odd but semantically neutral output like:
"true" => "1"
"false" => "0"
So long as `mapstructure.Decode` or `strconv.ParseBool` are used to
interpret these strings, there's no functional problem.
We had our first clear functional problem with #6005 and friends, where
users noticed diffs like the above showing up unexpectedly and causing
troubles when `ignore_changes` was in play.
This particular bug occurs down in Terraform core's EvalIgnoreChanges.
There, the diff is modified to account for ignored attributes, and
special logic attempts to handle properly the situation where the
ignored attribute was going to trigger a resource replacement. That
logic relies on the string representations of the Old and New fields in
the diff to be the same so that it filters properly.
So therefore, we now get a bug when a diff includes `Old: "0", New:
"false"` since the strings do not match, and `ignore_changes` is not
properly handled.
Here, we introduce `TypeBool`-specific normalizing into `finalizeDiff`.
I spiked out a full `diffBool` function, but figuring out which pieces
of `diffString` to duplicate there got hairy. This seemed like a simpler
and more direct solution.
Fixes#6005 (and potentially others!)
* MaxItems defines a maximum amount of items that can exist within a
TypeSet or TypeList. Specific use cases would be if a TypeSet is being
used to wrap a complex structure, however more than one instance would
cause instability.
It was a mistake to switched fully to `==` when activating waiting for
capacity on updates in #3947. Users that didn't set `min_elb_capacity ==
desired_capacity` and instead treated it as an actual "minimum" would
see timeouts for every create, since their target numbers would never be
reached exactly.
Here, we fix that regression by restoring the minimum waiting behavior
during creates.
In order to preserve all the stated behavior, I had to split out
different criteria for create and update, criteria which are now
exhaustively unit tested.
The set of fields that affect capacity waiting behavior has become a bit
of a mess. Next major release I'd like to rework all of these into a
more consistently named block of config. For now, just getting the
behavior correct and documented.
(Also removes all the fixed names from the ASG tests as I was hitting
collision issues running them over here.)
Fixes#4792
Implementation notes:
* The hash implementation was not considering key value, causing "diffs
did not match" errors when a value was updated. Switching to default
HashResource implementation fixes this
* Using HashResource as a default exposed a bug in helper/schema that
needed to be fixed so the Set function is picked up properly during
Read
* Stop writing back values into the `key` attribute; it triggers extra
diffs when `default` is used. Computed values all just go into `var`.
* Includes a state migration to prevent unnecessary diffs based on
"key" attribute hashcodes changing.
In the tests:
* Switch from leaning on the public demo Consul instance to requiring a
CONSUL_HTTP_ADDR variable be set pointing to a `consul agent -dev`
instance to be used only for testing.
* Add a test that exposes the updating issues and covers the fixes
Fixes#774Fixes#1866Fixes#3023
Changing the Set internals makes a lot of sense as it saves doing
conversions in multiple places and gives a central place to alter
the key when a item is computed.
This will have no side effects other then that the ordering is now
based on strings instead on integers, so the order will be different.
This will however have no effect on existing configs as these will
use the individual codes/keys and not the ordering to determine if
there is a diff or not.
Lastly (but I think also most importantly) there is a fix in this PR
that makes diffing sets extremely more performand. Before a full diff
required reading the complete Set for every single parameter/attribute
you wanted to diff, while now it only gets that specific parameter.
We have a use case where we have a Set that has 18 parameters and the
set consist of about 600 items (don't ask 😉). So when doing a diff
it would take 100% CPU of all cores and stay that way for almost an
hour before being able to complete the diff.
Debugging this we learned that for retrieving every single parameter
it made over 52.000 calls to `func (c *ResourceConfig) get(..)`. In
this function a slice is created and used only for the duration of the
call, so the time needed to create all needed slices and on the other
hand the time the garbage collector needed to clean them up again caused
the system to cripple itself. Next to that there are also some expensive
reflect calls in this function which also claimed a fair amount of CPU
time.
After this fix the number of calls needed to get a single parameter
dropped from 52.000+ to only 2! 😃