---
page_title: Module Composition
description: |-
  Module composition allows infrastructure to be described from modular
  building blocks.
---

# Module Composition

In a simple Terraform configuration with only one root module, we create a
flat set of resources and use Terraform's expression syntax to describe the
relationships between these resources:

```hcl
resource "aws_vpc" "example" {
  cidr_block = "10.1.0.0/16"
}

resource "aws_subnet" "example" {
  vpc_id = aws_vpc.example.id

  availability_zone = "us-west-2b"
  cidr_block        = cidrsubnet(aws_vpc.example.cidr_block, 4, 1)
}
```

When we introduce `module` blocks, our configuration becomes hierarchical
rather than flat: each module contains its own set of resources, and possibly
its own child modules, which can potentially create a deep, complex tree of
resource configurations.

However, in most cases we strongly recommend keeping the module tree flat,
with only one level of child modules, and using a technique similar to the
one above, with expressions describing the relationships between the modules:

```hcl
module "network" {
  source = "./modules/aws-network"

  base_cidr_block = "10.0.0.0/8"
}

module "consul_cluster" {
  source = "./modules/aws-consul-cluster"

  vpc_id     = module.network.vpc_id
  subnet_ids = module.network.subnet_ids
}
```
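
For the references `module.network.vpc_id` and `module.network.subnet_ids` to
resolve, the network module must export output values with those names. The
following is only a sketch of what `./modules/aws-network` might contain: the
resource names, the number of subnets, and the way the address ranges are
divided are assumptions made for illustration.

```hcl
# Hypothetical contents of ./modules/aws-network

variable "base_cidr_block" {
  type        = string
  description = "Address range from which the VPC range is allocated."
}

resource "aws_vpc" "main" {
  # Assumption: use the second /16 inside the base range
  # (10.1.0.0/16 when base_cidr_block is 10.0.0.0/8).
  cidr_block = cidrsubnet(var.base_cidr_block, 8, 1)
}

resource "aws_subnet" "main" {
  count = 2 # assumed number of subnets

  vpc_id     = aws_vpc.main.id
  cidr_block = cidrsubnet(aws_vpc.main.cidr_block, 4, count.index)
}

# These output names are what the root module refers to.
output "vpc_id" {
  value = aws_vpc.main.id
}

output "subnet_ids" {
  value = aws_subnet.main[*].id
}
```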

We call this flat style of module usage _module composition_, because it
takes multiple [composable](https://en.wikipedia.org/wiki/Composability)
building-block modules and assembles them together to produce a larger system.
Instead of a module _embedding_ its dependencies, creating and managing its
own copy, the module _receives_ its dependencies from the root module, which
can therefore connect the same modules in different ways to produce different
results.

The rest of this page discusses some more specific composition patterns that
may be useful when describing larger systems with Terraform.

## Dependency Inversion

In the example above, we saw a `consul_cluster` module that presumably describes
a cluster of [HashiCorp Consul](https://www.consul.io/) servers running in
an AWS VPC network, and thus it requires as arguments the identifiers of both
the VPC itself and of the subnets within that VPC.

An alternative design would be to have the `consul_cluster` module describe
its _own_ network resources, but if we did that then it would be hard for
the Consul cluster to coexist with other infrastructure in the same network,
and so where possible we prefer to keep modules relatively small and pass in
their dependencies.

This [dependency inversion](https://en.wikipedia.org/wiki/Dependency_inversion_principle)
approach also improves flexibility for future
refactoring, because the `consul_cluster` module doesn't know or care how
those identifiers are obtained by the calling module. A future refactor may
separate the network creation into its own configuration, and thus we may
pass those values into the module from data sources instead:

```hcl
data "aws_vpc" "main" {
  tags = {
    Environment = "production"
  }
}

# The aws_subnet_ids data source is deprecated in recent versions of the
# AWS provider; aws_subnets with a "vpc-id" filter is its replacement.
data "aws_subnets" "main" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.main.id]
  }
}

module "consul_cluster" {
  source = "./modules/aws-consul-cluster"

  vpc_id     = data.aws_vpc.main.id
  subnet_ids = data.aws_subnets.main.ids
}
```
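
Both ways of calling the module work because the module only declares the
values it needs and does not care where they come from. A minimal sketch of
the corresponding input variables inside `./modules/aws-consul-cluster` might
look like the following; the descriptions and exact type constraints are
assumptions, since the module's contents are not shown on this page.

```hcl
# Hypothetical input variables of ./modules/aws-consul-cluster

variable "vpc_id" {
  type        = string
  description = "ID of an existing VPC to deploy the Consul servers into."
}

variable "subnet_ids" {
  type        = list(string)
  description = "IDs of existing subnets within that VPC."
}
```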

### Conditional Creation of Objects

In situations where the same module is used across multiple environments,
it's common to see that some necessary object already exists in some
environments but needs to be created in other environments.

For example, this can arise in development environment scenarios: for cost
reasons, certain infrastructure may be shared across multiple development
environments, while in production the infrastructure is unique and managed
directly by the production configuration.

Rather than trying to write a module that itself tries to detect whether something
exists and create it if not, we recommend applying the dependency inversion
approach: making the module accept the object it needs as an argument, via
an input variable.

For example, consider a situation where a Terraform module deploys compute
instances based on a disk image, and in some environments there is a
specialized disk image available while other environments share a common
base disk image. Rather than having the module itself handle both of these
scenarios, we can instead declare an input variable for an object representing
the disk image. Using AWS EC2 as an example, we might declare a common subtype
of the `aws_ami` resource type and data source schemas:

```hcl
variable "ami" {
  type = object({
    # Declare an object using only the subset of attributes the module
    # needs. Terraform will allow any object that has at least these
    # attributes.
    id           = string
    architecture = string
  })
}
```
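
Inside the module, the object can then be used wherever the image is needed.
As a rough sketch, assuming the module manages a single `aws_instance` and
chooses between two illustrative instance types based on the image
architecture (the resource name and instance types are assumptions):

```hcl
# Hypothetical resource inside the module that declares var.ami

resource "aws_instance" "example" {
  ami = var.ami.id

  # Pick an instance family that matches the image architecture.
  instance_type = var.ami.architecture == "arm64" ? "t4g.micro" : "t3.micro"
}
```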

The caller of this module can now itself directly represent whether this is
an AMI to be created inline or an AMI to be retrieved from elsewhere:

```hcl
# In situations where the AMI will be directly managed:

resource "aws_ami_copy" "example" {
  name              = "local-copy-of-ami"
  source_ami_id     = "ami-abc123"
  source_ami_region = "eu-west-1"
}

module "example" {
  source = "./modules/example"

  ami = aws_ami_copy.example
}
```

```hcl
# Or, in situations where the AMI already exists:

data "aws_ami" "example" {
  # The data source takes a list of owners and selects images by tag
  # through filter blocks.
  owners = ["9999933333"]

  filter {
    name   = "tag:application"
    values = ["example-app"]
  }

  filter {
    name   = "tag:environment"
    values = ["dev"]
  }
}

module "example" {
  source = "./modules/example"

  ami = data.aws_ami.example
}
```

This is consistent with Terraform's declarative style: rather than creating
modules with complex conditional branches, we directly describe what
should already exist and what we want Terraform to manage itself.

By following this pattern, we can be explicit about the situations in which we
expect the AMI to already be present and those in which we don't. A future
reader of the configuration can then directly understand what it is intending
to do without first needing to inspect the state of the remote system.

In the above example, the object to be created or read is simple enough to
be given inline as a single resource, but we can also compose together multiple
modules as described elsewhere on this page in situations where the
dependencies themselves are complicated enough to benefit from abstractions.

## Multi-cloud Abstractions

Terraform itself intentionally does not attempt to abstract over similar
services offered by different vendors, because we want to expose the full
functionality of each offering, and unifying multiple offerings behind a
single interface tends to require a "lowest common denominator" approach.

However, through composition of Terraform modules it is possible to create
your own lightweight multi-cloud abstractions by making your own tradeoffs
about which platform features are important to you.

Opportunities for such abstractions arise in any situation where multiple
vendors implement the same concept, protocol, or open standard. For example,
the basic capabilities of the domain name system are common across all vendors,
and although some vendors differentiate themselves with unique features such
as geolocation and smart load balancing, you may conclude that in your use-case
you are willing to eschew those features in return for creating modules that
abstract the common DNS concepts across multiple vendors:

```hcl
module "webserver" {
  source = "./modules/webserver"
}

locals {
  fixed_recordsets = [
    {
      name = "www"
      type = "CNAME"
      ttl  = 3600
      records = [
        "webserver01",
        "webserver02",
        "webserver03",
      ]
    },
  ]
  server_recordsets = [
    for i, addr in module.webserver.public_ip_addrs : {
      # i is zero-based, so add one to match the names used above.
      name    = format("webserver%02d", i + 1)
      type    = "A"
      ttl     = 3600
      records = [addr]
    }
  ]
}

module "dns_records" {
  source = "./modules/route53-dns-records"

  route53_zone_id = var.route53_zone_id
  recordsets      = concat(local.fixed_recordsets, local.server_recordsets)
}
```

In the above example, we've created a lightweight abstraction in the form of
a "recordset" object. This contains the attributes that describe the general
idea of a DNS recordset that should be mappable onto any DNS provider.

We then instantiate one specific _implementation_ of that abstraction as a
module, in this case deploying our recordsets to Amazon Route53.

If we later wanted to switch to a different DNS provider, we'd only need to
replace the `dns_records` module with a new implementation targeting that
provider, and all of the configuration that _produces_ the recordset
definitions can remain unchanged.

We can create lightweight abstractions like these by defining Terraform object
types representing the concepts involved and then using these object types
for module input variables. In this case, all of our "DNS records"
implementations would have the following variable declared:

```hcl
variable "recordsets" {
  type = list(object({
    name    = string
    type    = string
    ttl     = number
    records = list(string)
  }))
}
```
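
To make the idea of an _implementation_ concrete, here is a sketch of how a
Route53-specific module might turn those provider-neutral objects into
records. It assumes the module also declares a `route53_zone_id` variable and
that the combination of name and type is unique per recordset; the resource
label `this` and the key format are illustrative choices, not something this
page prescribes.

```hcl
# Hypothetical Route53 implementation that consumes var.recordsets

variable "route53_zone_id" {
  type = string
}

resource "aws_route53_record" "this" {
  # Key each recordset by name and type so that instances have stable
  # addresses even if the order of the list changes.
  for_each = { for rs in var.recordsets : "${rs.name}/${rs.type}" => rs }

  zone_id = var.route53_zone_id
  name    = each.value.name
  type    = each.value.type
  ttl     = each.value.ttl
  records = each.value.records
}
```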

While DNS serves as a simple example, there are many more opportunities to
exploit common elements across vendors. A more complex example is Kubernetes,
where there are now many different vendors offering hosted Kubernetes clusters
and even more ways to run Kubernetes yourself.

If the common functionality across all of these implementations is sufficient
for your needs, you may choose to implement a set of different modules that
describe a particular Kubernetes cluster implementation and all have the common
trait of exporting the hostname of the cluster as an output value:

```hcl
output "hostname" {
  value = azurerm_kubernetes_cluster.main.fqdn
}
```
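
A module for a different vendor would export the same output name while
deriving the value from that vendor's own resource. As a hedged sketch, a
hypothetical Amazon EKS variant might look like the following. It assumes a
resource named `aws_eks_cluster.main` declared elsewhere in that module, and
that its `endpoint` attribute is an `https://` URL, so the scheme is stripped
to leave only the hostname.

```hcl
# Hypothetical output of an EKS-based cluster module

output "hostname" {
  # Assumption: endpoint has the form "https://<hostname>".
  value = trimprefix(aws_eks_cluster.main.endpoint, "https://")
}
```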

You can then write _other_ modules that expect only a Kubernetes cluster
hostname as input and use them interchangeably with any of your Kubernetes
cluster modules:

```hcl
module "k8s_cluster" {
  # Local module paths must begin with ./ or ../
  source = "./modules/azurerm-k8s-cluster"

  # (Azure-specific configuration arguments)
}

module "monitoring_tools" {
  source = "./modules/monitoring_tools"

  cluster_hostname = module.k8s_cluster.hostname
}
```

## Data-only Modules

Most modules contain `resource` blocks and thus describe infrastructure to be
created and managed. It may sometimes be useful to write modules that do not
describe any new infrastructure at all, but merely retrieve information about
existing infrastructure that was created elsewhere using
[data sources](/language/data-sources).

As with conventional modules, we suggest using this technique only when the
module raises the level of abstraction in some way, in this case by
encapsulating exactly how the data is retrieved.

A common use of this technique is when a system has been decomposed into several
subsystem configurations but there is certain infrastructure that is shared
across all of the subsystems, such as a common IP network. In this situation,
we might write a shared module called `join-network-aws` which can be called
by any configuration that needs information about the shared network when
deployed in AWS:

```hcl
module "network" {
  source = "./modules/join-network-aws"

  environment = "production"
}

module "k8s_cluster" {
  source = "./modules/aws-k8s-cluster"

  subnet_ids = module.network.aws_subnet_ids
}
```
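
As one possible sketch of the inside of `join-network-aws`, assume the shared
network is found by querying the AWS API for a VPC tagged with the environment
name. The `Environment` tag key, the data source labels, and the `vpc_id`
output are assumptions; only the `environment` argument and the
`aws_subnet_ids` output come from the example above.

```hcl
# Hypothetical contents of ./modules/join-network-aws

variable "environment" {
  type = string
}

data "aws_vpc" "main" {
  tags = {
    Environment = var.environment
  }
}

data "aws_subnets" "main" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.main.id]
  }
}

output "vpc_id" {
  value = data.aws_vpc.main.id
}

output "aws_subnet_ids" {
  value = data.aws_subnets.main.ids
}
```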

The `network` module itself could retrieve this data in a number of different
ways: it could query the AWS API directly using
[`aws_vpc`](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/vpc)
and
[`aws_subnets`](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/subnets)
data sources, or it could read saved information from a Consul cluster using
[`consul_keys`](https://registry.terraform.io/providers/hashicorp/consul/latest/docs/data-sources/keys),
or it might read the outputs directly from the state of the configuration that
manages the network using
[`terraform_remote_state`](/language/state/remote-state-data).

The key benefit of this approach is that the source of this information can
change over time without updating every configuration that depends on it.
Furthermore, if you design your data-only module with a similar set of outputs
as a corresponding management module, you can swap between the two relatively
easily when refactoring.