This package will do all of its work in-memory so that it can avoid making
partial updates and then failing, so we need to be able to load the
sources files from a particular directory into memory.
The upgrade process isn't idempotent, so we also attempt to detect
heuristically whether an upgrade has already been performed (can parse
with the new parser and has a version constraint that prevents versions
earlier than 0.12) so that the CLI tool that will eventually wrap this
will be able to produce a warning and prompt for confirmation in that
case.
Although the new HCL implementation and configuration loader is broadly
compatible with the prior implementation, it has a number of new idiomatic
forms and it also cannot parse some more extreme non-idiomatic usages
that were possible under the old parser.
To help users migrate to the new implementation, this package will rewrite
configuration to comply with the new idiom and fix as many cases as
possible where the legacy parser was too liberal or exposed implementation
details.
It is common for the same module source package to be referenced multiple
times in the same configuration, either because there are literally
multiple instances of the same module source or because a single package
(or repository) contains multiple modules in sub-directories and many
of them are referenced.
To optimize this, here we introduce a simple caching behavior where the
module installer will detect if it's asked to install multiple times from
the same source and produce the second and subsequent directories by
copying the first, rather than by downloading again over the network.
This optimization is applied once all of the go-getter detection has
completed and sub-directory portions have been trimmed, so it is also
able to normalize differently-specified source addresses that all
ultimately detect to the same resolved address. When installing, we
always extract the entire specified package (or repository) and then
reference the specified sub-directory, so we can safely re-use existing
directories when the base package is the same, even if the sub-directory
is different.
However, as a result we do not yet address the fact that the same package
will be stored multiple times _on disk_, which may still be problematic
when referencing large repositories multiple times in
disk-storage-constrained environments. We could address this in a
subsequent change by investigating the use of symlinks where possible.
Since the Registry installer is implemented just as an extra resolution
step in front of go-getter, this optimization applies to registry
modules too. This does not apply to local relative references, which will
continue to just resolve into the already-prepared directory of their
parent module.
The cache of previously installed paths lives only for the duration of
one call to InstallModules, so we will never re-use directories that
were created by previous runs of "terraform init" and there is no risk
that older versions will pollute the cache when attempting an upgrade
from a source address that doesn't explicitly specify a version.
No additional tests are added here because the existing module installer
tests (when TF_ACC=1) already cover the case of installing multiple
modules from the same source.
Here we introduce a new idea of a "configuration snapshot", which is an
in-memory copy of the source code of each of the files that make up
the configuration. The primary intended purpose for this is as an
intermediate step before writing the configuration files into a plan file,
and then reading them out when that plan file is later applied.
During earlier configs package development we expected to use an afero vfs
implementation to read directly from the zip file, but that doesn't work
in practice because we need to preserve module paths from the source file
system that might include parent directory traversals (../) while
retaining the original path for use in error messages.
The result, for now, is a bit of an abstraction inversion: we implement
a specialized afero vfs implementation that makes the sparse filesystem
representation from a snapshot appear like a normal filesystem just well
enough that the config loader and parser can work with it.
In future we may wish to rework the internals here so that the main
abstraction is at a similar level to the snapshot and then that API is
mapped to the native filesystem in the normal case, removing afero. For
now though, this approach avoids the need for a significant redesign
of the parser/loader internals, at the expense of some trickiness in the
case where we're reading from a snapshot.
This commit does not yet include the reading and writing of snapshots into
plan files. That will follow in a subsequent commit.
This is important in particular for shimming the "providers" map in module
blocks:
providers = {
"aws" = "aws.foo"
}
We call this shim for both the key and the value here, and the value would
previously have worked. However, the key is wrapped up by the parser in
an ObjectConsKeyExpr container, which deals with the fact that in normal
use an object constructor key that is just a bare identifier is actually
interpreted as a string. We don't care about that interpretation for our
shimming purposes, and so we can just unwrap it here.
Throughout the main "terraform" package we identify resources using the
address types, and so this helper is useful to make concise transitions
between the address types and the configuration types.
As part of this, we use the address types to produce the keys used in our
resource maps. This has no visible change in behavior since the prior
implementation produced an equal result, but this change ensures that
ResourceByAddr cannot be broken by hypothetical future changes to the
key serialization.
We can only do this when modules are loaded with Parser.LoadConfigDir,
but in practice this is the common case anyway.
This is important to support the path.module and path.root expressions in
configuration.
addrs.Module is itself internally just []string, but this better
communicates our intent here and makes this integrate better with other
code which is using this type for this purposes.
Our new "addrs" package gives us some nice representations of various
kinds of "address" within Terraform. To talk to APIs that use these, it's
convenient to be able to easily derive such addresses from the
configuration objects.
These new methods, along with a recasting of the existing
Resource.ProviderConfigKey method to Resource.ProviderConfigAddr, give us
some key integration points to support the configuration graph transforms
in the main "terraform" package.
This was accidentally missed on the first pass of module call decoding.
As before, this is a map from child provider config address to parent
provider config address, allowing the set of providers to be projected in
arbitrary ways into a child module.
This is a built-in implementation of ModuleWalker that just returns an
error any time it's asked for a module. This is intended for simple unit
tests where no child modules are needed anyway.
This is useful for creating a valid placeholder configuration, but not
much else. Most callers should use BuildConfig to build a configuration
that actually has something in it.
Initially the intent here was to tease these apart a little more since
they don't really share much behavior in common in core, but in practice
it'll take a lot of refactoring to tease apart these assumptions in core
right now and so we'll keep these things unified at the configuration
layer in the interests of minimizing disruption at the core layer.
The two types are still kept in separate maps to help reinforce the fact
that they are separate concepts with some behaviors in common, rather than
the same concept.
We initially just mimicked our old practice of using []string for module
paths here, but the addrs package now gives us a pair of types that better
capture the two different kinds of module addresses we are dealing with:
static addresses (nodes in the configuration tree) and dynamic/instance
addresses (which can represent the situation where multiple instances are
created from a single module call).
This distinction still remains rather artificial since we don't yet have
support for count or for_each on module calls, but this is intended to lay
the foundations for that to be added later, and in the mean time just
gives us some handy helper functions for parsing and formatting these
address types.
For the moment this is just a lightly-adapted copy of
ModuleTreeDependencies named ConfigTreeDependencies, with the goal that
the two can live concurrently for the moment while not all callers are yet
updated and then we can drop ModuleTreeDependencies and its helper
functions altogether in a later commit.
This can then be used to make "terraform init" and "terraform providers"
work properly with the HCL2-powered configuration loader.
This is a rather-messy, complex change to get the "command" package
building again against the new backend API that was updated for
the new configuration loader.
A lot of this is mechanical rewriting to the new API, but
meta_config.go and meta_backend.go in particular saw some major
changes to interface with the new loader APIs and to deal with
the change in order of steps in the backend API.
These utility functions are intended to allow concisely loading a
configuration from a fixture directory in a test, bailing out early if
there are any unexpected errors.
We have a few special use-cases in Terraform where an object is
constructed from a mixture of different sources, such as a configuration
file, command line arguments, and environment variables.
To represent this within the HCL model, we introduce a new "synthetic"
HCL body type that just represents a map of values that are interpreted
as attributes.
We then export the previously-private MergeBodies function to allow the
synthetic body to be used as an override for a "real" body, which then
allows us to combine these various sources together while still retaining
the proper source location information for each individual attribute.
Since a synthetic body doesn't actually exist in configuration, it does
not produce source locations that can be turned into source snippets but
we can still use placeholder strings to help the user to understand
which of the many different sources a particular value came from.
By adding this method you now only have to pass a `*disco.Disco` object around in order to do discovery and use any configured credentials for the discovered hosts.
Of course you can also still pass around both a `*disco.Disco` and a `auth.CredentialsSource` object if there is a need or a reason for that!
Previously we just ported over the simple "string", "list", and "map" type
hint keywords from the old loader, which exist primarily as hints to the
CLI for whether to treat -var=... arguments and environment variables as
literal strings or as HCL expressions.
However, we've been requested before to allow more specific constraints
here because it's generally better UX for a type error to be detected
within an expression in a calling "module" block rather than at some point
deep inside a third-party module.
To allow for more specific constraints, here we use the type constraint
expression syntax defined as an extension within HCL, which uses the
variable and function call syntaxes to represent types rather than values,
like this:
- string
- number
- bool
- list(string)
- list(any)
- list(map(string))
- object({id=string,name=string})
In native HCL syntax this looks like:
variable "foo" {
type = map(string)
}
In JSON, this looks like:
{
"variable": {
"foo": {
"type": "map(string)"
}
}
}
The selection of literal processing or HCL parsing of CLI-set values is
now explicit in the model and separate from the type, though it's still
derived from the type constraint and thus not directly controllable in
configuration.
Since this syntax is more complex than the keywords that replaced it, for
now the simpler keywords are still supported and "list" and "map" are
interpreted as list(any) and map(any) respectively, mimicking how they
were interpreted by Terraform 0.11 and earlier. For the time being our
documentation should continue to recommend these shorthand versions until
we gain more experience with the more-specific type constraints; most
users should just make use of the additional primitive type constraints
this enables: bool and number.
As a result of these more-complete type constraints, we can now type-check
the default value at config load time, which has the nice side-effect of
allowing us to produce a tailored error message if an override file
produces an invalid situation; previously the result was rather confusing
because the error message referred to the original definition of the
variable and not the overridden parts.
Although we do still consider these deprecated for 0.12, we'll defer
actually generating warnings for them until a later minor release so that
module authors can retain their quoted identifiers for a period after 0.12
release for backward-compatibility with Terraform 0.11.
The error-handling behavior of the HCL parser was improved, which causes
the number of diagnostics and the diagnostics messages to be different
in cases where a block-like introduction is given but without any
following body.
The initial pass of implementation here missed the special case where
ignore_changes can, in the old parser, be set to ["*"] to ignore changes
to all attributes.
Since that syntax is awkward and non-obvious, our new decoder will instead
expect ignore_changes = all, using HCL2's capability to interpret an
expression as a literal keyword. For compatibility with old configurations
we will still accept the ["*"] form but emit a deprecation warning to
encourage moving to the new form.
In our new loader we are changing certain values in configuration to be
naked keywords or references rather than quoted strings as before. Since
many of these have been shown in books, tutorials, and our own
documentation we will make the old forms generate deprecation warnings
rather than errors so that newcomers starting from older documentation
can be eased into the new syntax, rather than getting blocked.
This will also avoid creating a hard compatibility wall for reusable
modules that are already published, allowing them to still be used in
spite of these warnings and then fixed when the maintainer is able.
Previously we were just loading the module and asserting no diagnostics,
but that is not really good enough since if we install modules incorrectly
it's possible that we are still able to load an empty configuration
successfully.
Now we'll do some basic inspecetion of the module tree that results from
loading what we installed, to ensure that all of the expected modules
are present at the right locations in the tree.
This will provide the functionality of "terraform init -from-module=...",
which uses the contents of a given module to populate the working
directory.
This mechanism is intended for installing e.g. examples from Terraform
Registry or elsewhere. It's not fully-general since it can't reasonably
install a module from a subdir that refers up to a parent directory, but
that isn't an issue for all reasonable uses of this option.
Originally the hope was to use the afero filesystem abstraction for all
loader operations, but since we install modules using go-getter we cannot
(without a lot of refactoring) support vfs for installation.
The vfs use-case is for reading configuration from plan zip files anyway,
and so we have no real reason to support installation into a vfs. For now
at least we will just add the possibility that a loader might not be
install-capable. At the moment we have no non-install-capable loaders, but
we'll add one later once we get to loading configuration from plan files.
Unlike the old installer in config/module, this uses new-style
installation directories that include the static module path so that paths
we show in diagnostics will be more meaningful to the user.
As before, we retrieve the entire "package" associated with the given
source string, rather than any given subdirectory directly, because the
retrieved module may contain ../ references into parent directories which
must be resolvable after extraction.
This is not strictly necessary, but since this is not a
performance-critical codepath we'll do this because it makes life easier
for callers that want to print out user-facing logs about build process,
or who are logging actions taken as part of a unit test.
Enough of the InstallModules method to install local modules (those with
relative paths). "Install" is actually a bit of an exaggeration for these
since we actually just record them in our manifest after verifying that
the source directory exists.
This is a change of behavior relative to the old module installer since
we no longer create a symlink to the module directory inside the
.terraform/modules directory. Instead, we record the module's true
location in our manifest so that the loader will find it later.
The use of a symlink here predated the manifest file. Now that we have a
manifest file the symlinks are redundant. Using the "natural" location of
the module leads to more helpful error messages, since we'll refer to
the module path as the user expects it, rather than to an internal alias.
Previously the behavior for loading and installing modules was included in
the same package as the representation of the module tree (in the
config/module package).
In our new world, the model of a module tree (now called a "Config") is
included in "configs" along with the Module and File structs. This new
package replaces the loading and installation functionality previously
in config/module with new equivalents that work with the model objects
in "configs".
As of this commit, only the loading functionality is implemented. The
installation functionality will follow in subsequent commits.
BuildConfig creates a module tree by recursively walking through module
calls in the root module and any descendent modules. This is intended to
be used both for the simple case of loading already-installed modules and
the more complex case of installing modules inside "terraform init", both
of which will be dealt with in a separate package.
mergeBody is a hcl.Body implementation that deals with our override file
merging behavior for the portions of the configuration that are not
processed until full eval time.
Mimicking the behavior of our old config merge implementation from the
"config" package, the rules here are:
- Attributes in the override body hide attributes of the same name in
the base body.
- Any block in the override body hides all blocks with the same type name
that appear in the base body.
This is tested by a new test for the overriding of module arguments, which
asserts the correct behavior of the merged body as part of its work.
Some of the fields in our config structs are either mandatory in primary
files or there is a default value that we apply if absent.
Unfortunately override files impose the additional constraint that we
be allowed to omit required fields (which have presumably already been
set in the primary files) and that we are able to distinguish between a
default value and omitting a value entirely.
Since most of our fields were already acceptable for override files, here
we just add some new fields to deal with the few cases where special
handling is required and a helper function to disable the "Required" flag
on attributes in a given schema.
This method wraps LoadConfigFile to load all of the .tf and .tf.json files
in a given directory and then bundle them together into a Module object.
This function also deals with the distinction between primary and override
files, first appending together the primary files in lexicographic order
by filename, and then merging in override files in the same order.
The merging behavior is not fully implemented as of this commit, and so
will be expanded in future commits.
Much like TestParserLoadConfigFileSuccess, this is intended to be an
easy-to-maintain collection of bad examples to test different permutations
of our error handling.
As with TestParserLoadConfigFileSuccess, we should also have more specific
tests alongside this that check that the error outcome is what was
expected, since this test just accepts any error and may thus not be
testing what we think it is.
This test is intended to be an easy-to-maintain catalog of good examples
that we can use to catch certain parsing or decoding regressions easily.
It's not a fully-comprehensive test since it doesn't check the result
of decoding, instead just accepting any decode that completes without
errors. However, an easy-to-maintain test like this is a good complement
to some more specialized tests since we can easily collect good examples
over time and just add them in here.
This is a first pass of decoding of the main Terraform configuration file
format. It hasn't yet been tested with any real-world configurations, so
it will need to be revised further as we test it more thoroughly.
These types represent the individual elements within configuration, the
modules a configuration is made of, and the configuration (static module
tree) itself.
This method loads a "values file" -- also known as a "tfvars file" -- and
returns the values found inside.
A values file is an HCL file (in either native or JSON syntax) whose
top-level body is treated as a set of arbitrary key/value pairs whose
values may not depend on any variables or functions.
We will load values files through a configs.Parser -- even though values
files are not strictly-speaking part of configuration -- because this
causes them to be registered in our source code cache so that we can
generate source code snippets if we need to report any diagnostics.
configs.Parser is the entry-point for this package, providing functions to
load and parse HCL-based configuration files.
We use the library "afero" to decouple the parser from the physical OS
filesystem, which here allows us to easily use an in-memory filesystem
for testing and will, in future, allow us to read files from more unusual
places, such as configuration embedded in a plan file.
There's a lot of complexity in our existing "config" package that results
from our approach to handling configuration with HCL and HIL. A lot of
that functionality is no longer needed -- or must work in a significantly
different way -- for HCL2.
The new package "configs", which is named following the convention of some
Go standard library packages like "strings", is a re-imagination of some
of the functionality from the "config" package for an HCL2-only world.
The scope of this package will be slightly smaller than "config", since
it only deals with config loading and not with expression evaluation.
Another package "lang" (mentioned in the docstring here but not yet added)
will deal with the more dynamic portions of of configuration handling,
including populating an hcl.EvalContext to evaluate expressions.