website: internals
parent
9ff8856fe8
commit
d5049b65da
|
@ -1,117 +0,0 @@
|
||||||
---
|
|
||||||
layout: "docs"
|
|
||||||
page_title: "Terraform Architecture"
|
|
||||||
sidebar_current: "docs-internals-architecture"
|
|
||||||
---
|
|
||||||
|
|
||||||
# Terraform Architecture
|
|
||||||
|
|
||||||
Terraform is a complex system that has many different moving parts. To help
|
|
||||||
users and developers of Terraform form a mental model of how it works, this
|
|
||||||
page documents the system architecture.
|
|
||||||
|
|
||||||
<div class="alert alert-block alert-warning">
|
|
||||||
<strong>Advanced Topic!</strong> This page covers technical details of
|
|
||||||
the internals of Terraform. You don't need to know these details to effectively
|
|
||||||
operate and use Terraform. These details are documented here for those who wish
|
|
||||||
to learn about them without having to go spelunking through the source code.
|
|
||||||
</div>
|
|
||||||
|
|
||||||
## Glossary
|
|
||||||
|
|
||||||
Before describing the architecture, we provide a glossary of terms to help
|
|
||||||
clarify what is being discussed:
|
|
||||||
|
|
||||||
* Agent - An agent is the long running daemon on every member of the Terraform cluster.
|
|
||||||
It is started by running `terraform agent`. The agent is able to run in either *client*,
|
|
||||||
or *server* mode. Since all nodes must be running an agent, it is simpler to refer to
|
|
||||||
the node as either being a client or server, but there are other instances of the agent. All
|
|
||||||
agents can run the DNS or HTTP interfaces, and are responsible for running checks and
|
|
||||||
keeping services in sync.
|
|
||||||
|
|
||||||
* Client - A client is an agent that forwards all RPCs to a server. The client is relatively
|
|
||||||
stateless. The only background activity a client performs is taking part in the LAN gossip pool.
|
|
||||||
This has a minimal resource overhead and consumes only a small amount of network bandwidth.
|
|
||||||
|
|
||||||
* Server - An agent that is in server mode. When in server mode, there is an expanded set
|
|
||||||
of responsibilities including participating in the Raft quorum, maintaining cluster state,
|
|
||||||
responding to RPC queries, WAN gossip to other datacenters, and forwarding queries to leaders
|
|
||||||
or remote datacenters.
|
|
||||||
|
|
||||||
* Datacenter - A datacenter seems obvious, but there are subtle details such as multiple
|
|
||||||
availability zones in EC2. We define a datacenter to be a networking environment that is
|
|
||||||
private, low latency, and high bandwidth. This excludes communication that would traverse
|
|
||||||
the public internet.
|
|
||||||
|
|
||||||
* Consensus - When used in our documentation we use consensus to mean agreement upon
|
|
||||||
the elected leader as well as agreement on the ordering of transactions. Since these
|
|
||||||
transactions are applied to a FSM, we implicitly include the consistency of a replicated
|
|
||||||
state machine. Consensus is described in more detail on [Wikipedia](http://en.wikipedia.org/wiki/Consensus_(computer_science)),
|
|
||||||
as well as our [implementation here](/docs/internals/consensus.html).
|
|
||||||
|
|
||||||
* Gossip - Terraform is built on top of [Serf](http://www.serfdom.io/), which provides a full
|
|
||||||
[gossip protocol](http://en.wikipedia.org/wiki/Gossip_protocol) that is used for multiple purposes.
|
|
||||||
Serf provides membership, failure detection, and event broadcast mechanisms. Our use of these
|
|
||||||
is described more in the [gossip documentation](/docs/internals/gossip.html). It is enough to know
|
|
||||||
gossip involves random node-to-node communication, primarily over UDP.
|
|
||||||
|
|
||||||
* LAN Gossip - This is used to mean that there is a gossip pool, containing nodes that
|
|
||||||
are all located on the same local area network or datacenter.
|
|
||||||
|
|
||||||
* WAN Gossip - This is used to mean that there is a gossip pool, containing servers that
|
|
||||||
are primarily located in different datacenters and must communicate over the internet or
|
|
||||||
wide area network.
|
|
||||||
|
|
||||||
* RPC - RPC is short for a Remote Procedure Call. This is a request / response mechanism
|
|
||||||
allowing a client to make a request from a server.
|
|
||||||
|
|
||||||
## 10,000 foot view
|
|
||||||
|
|
||||||
From a 10,000 foot altitude the architecture of Terraform looks like this:
|
|
||||||
|
|
||||||
![Terraform Architecture](/images/terraform-arch.png)
|
|
||||||
|
|
||||||
Let's break down this image and describe each piece. First of all we can see
|
|
||||||
that there are two datacenters, one and two respectively. Terraform has first
|
|
||||||
class support for multiple datacenters and expects this to be the common case.
|
|
||||||
|
|
||||||
Within each datacenter we have a mixture of clients and servers. It is expected
|
|
||||||
that there be between three and five servers. This strikes a balance between
|
|
||||||
availability in the case of failure and performance, as consensus gets progressively
|
|
||||||
slower as more machines are added. However, there is no limit to the number of clients,
|
|
||||||
and they can easily scale into the thousands or tens of thousands.
|
|
||||||
|
|
||||||
All the nodes that are in a datacenter participate in a [gossip protocol](/docs/internals/gossip.html).
|
|
||||||
This means there is a gossip pool that contains all the nodes for a given datacenter. This serves
|
|
||||||
a few purposes: first, there is no need to configure clients with the addresses of servers;
|
|
||||||
discovery is done automatically. Second, the work of detecting node failures
|
|
||||||
is not placed on the servers but is distributed. This makes the failure detection much more
|
|
||||||
scalable than naive heartbeating schemes. Third, it is used as a messaging layer to notify
|
|
||||||
when important events such as leader election take place.
|
|
||||||
|
|
||||||
The servers in each datacenter are all part of a single Raft peer set. This means that
|
|
||||||
they work together to elect a leader, which has extra duties. The leader is responsible for
|
|
||||||
processing all queries and transactions. Transactions must also be replicated to all peers
|
|
||||||
as part of the [consensus protocol](/docs/internals/consensus.html). Because of this requirement,
|
|
||||||
when a non-leader server receives an RPC request it forwards it to the cluster leader.
|
|
||||||
|
|
||||||
The server nodes also operate as part of a WAN gossip. This pool is different from the LAN pool,
|
|
||||||
as it is optimized for the higher latency of the internet, and is expected to only contain
|
|
||||||
other Terraform server nodes. The purpose of this pool is to allow datacenters to discover each
|
|
||||||
other in a low touch manner. Bringing a new datacenter online is as easy as joining the existing
|
|
||||||
WAN gossip. Because the servers are all operating in this pool, it also enables cross-datacenter requests.
|
|
||||||
When a server receives a request for a different datacenter, it forwards it to a random server
|
|
||||||
in the correct datacenter. That server may then forward to the local leader.
|
|
||||||
|
|
||||||
This results in a very low coupling between datacenters, but because of failure detection,
|
|
||||||
connection caching and multiplexing, cross-datacenter requests are relatively fast and reliable.
|
|
||||||
|
|
||||||
## Getting in depth
|
|
||||||
|
|
||||||
At this point we've covered the high level architecture of Terraform, but there are many
|
|
||||||
more details to each of the sub-systems. The [consensus protocol](/docs/internals/consensus.html) is
|
|
||||||
documented in detail, as is the [gossip protocol](/docs/internals/gossip.html). The [documentation](/docs/internals/security.html)
|
|
||||||
for the security model and protocols used is also available.
|
|
||||||
|
|
||||||
For other details, either consult the code, ask in IRC or reach out to the mailing list.
|
|
||||||
|
|
|
@ -0,0 +1,98 @@
|
||||||
|
---
|
||||||
|
layout: "docs"
|
||||||
|
page_title: "Resource Graph"
|
||||||
|
sidebar_current: "docs-internals-graph"
|
||||||
|
---
|
||||||
|
|
||||||
|
# Resource Graph
|
||||||
|
|
||||||
|
Terraform builds a
|
||||||
|
[dependency graph](http://en.wikipedia.org/wiki/Dependency_graph)
|
||||||
|
from the Terraform configurations, and walks this graph to
|
||||||
|
generate plans, refresh state, and more. This page documents
|
||||||
|
the details of what is contained in this graph, what types
|
||||||
|
of nodes there are, and how the edges of the graph are determined.
|
||||||
|
|
||||||
|
<div class="alert alert-block alert-warning">
|
||||||
|
<strong>Advanced Topic!</strong> This page covers technical details
|
||||||
|
of Terraform. You don't need to understand these details to
|
||||||
|
effectively use Terraform. The details are documented here for
|
||||||
|
those who wish to learn about them without having to go
|
||||||
|
spelunking through the source code.
|
||||||
|
</div>
|
||||||
|
|
||||||
|
## Graph Nodes
|
||||||
|
|
||||||
|
There are only a handful of node types that can exist within the
|
||||||
|
graph. We'll cover these first before explaining how they're
|
||||||
|
determined and built:
|
||||||
|
|
||||||
|
* **Resource Node** - Represents a single resource. If you have
|
||||||
|
the `count` metaparameter set, then there will be one resource
|
||||||
|
node for each count. The configuration, diff, state, etc. of
|
||||||
|
the resource under change is attached to this node.
|
||||||
|
|
||||||
|
* **Provider Configuration Node** - Represents the time to fully
|
||||||
|
configure a provider. This is when the provider configuration
|
||||||
|
block is given to a provider, such as AWS security credentials.
|
||||||
|
|
||||||
|
* **Resource Meta-Node** - Represents a group of resources, but
|
||||||
|
does not represent any action on its own. This is done for
|
||||||
|
convenience on dependencies and making a prettier graph. This
|
||||||
|
node is only present for resources that have a `count`
|
||||||
|
parameter greater than 1.
|
||||||
|
|
||||||
|
When visualizing a configuration with `terraform graph`, you can
|
||||||
|
see all of these nodes present.
|
||||||
|
|
||||||
|
## Building the Graph
|
||||||
|
|
||||||
|
Building the graph is done in a series of sequential steps:
|
||||||
|
|
||||||
|
1. Resource nodes are added based on the configuration. If a
|
||||||
|
diff (plan) or state is present, that meta-data is attached
|
||||||
|
to each resource node.
|
||||||
|
|
||||||
|
1. Resources are mapped to provisioners if they have any
|
||||||
|
defined. This must be done after all resource nodes are
|
||||||
|
created so resources with the same provisioner type can
|
||||||
|
share the provisioner implementation.
|
||||||
|
|
||||||
|
1. Explicit dependencies from the `depends_on` meta-parameter
|
||||||
|
are used to create edges between resources.
|
||||||
|
|
||||||
|
1. If a state is present, any "orphan" resources are added to
|
||||||
|
the graph. Orphan resources are any resources that are no
|
||||||
|
longer present in the configuration but are present in the
|
||||||
|
state file. Orphans never have any configuration associated
|
||||||
|
with them, since the state file does not store configuration.
|
||||||
|
|
||||||
|
1. Resources are mapped to providers. Provider configuration
|
||||||
|
nodes are created for these providers, and edges are created
|
||||||
|
such that the resources depend on their respective provider
|
||||||
|
being configured.
|
||||||
|
|
||||||
|
1. Interpolations are parsed in resource and provider configurations
|
||||||
|
to determine dependencies. References to resource attributes
|
||||||
|
are turned into dependencies from the resource with the interpolation
|
||||||
|
to the resource being referenced.
|
||||||
|
|
||||||
|
1. Create a root node. The root node points to all resources and
|
||||||
|
is created so there is a single root to the dependency graph. When
|
||||||
|
traversing the graph, the root node is ignored.
|
||||||
|
|
||||||
|
1. If a diff is present, traverse all resource nodes and find resources
|
||||||
|
that are being destroyed. These resource nodes are split into two:
|
||||||
|
one node that destroys the resource and another that creates
|
||||||
|
the resource (if it is being recreated). The reason the nodes must
|
||||||
|
be split is that the destroy order is often different from the
|
||||||
|
create order, and so they can't be represented by a single graph
|
||||||
|
node.
|
||||||
|
|
||||||
|
1. Validate the graph has no cycles and has a single root.
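
The edge-building and validation steps above can be sketched in a few lines. This is a minimal illustration, not Terraform's actual implementation (which is written in Go): the `CONFIG` dictionary, the `REF` pattern, and the `build_graph` and `validate` helpers are all hypothetical names, and the sketch ignores providers, provisioners, orphans, and destroy-node splitting.

```python
import re

# Hypothetical, simplified configuration: resource name -> settings.
# The "${type.name.attr}" form imitates an interpolation; the parsing
# here is an assumption made for illustration only.
CONFIG = {
    "aws_instance.web": {
        "subnet": "${aws_subnet.main.id}",
        "depends_on": ["aws_iam_role.app"],
    },
    "aws_subnet.main": {"vpc": "${aws_vpc.main.id}"},
    "aws_vpc.main": {},
    "aws_iam_role.app": {},
}

# Captures the "type.name" part of a "${type.name.attr}" reference.
REF = re.compile(r"\$\{([a-z0-9_]+\.[a-z0-9_]+)\.[a-z0-9_]+\}")


def build_graph(config):
    """Return {node: set of nodes it depends on}, including a root node."""
    graph = {name: set() for name in config}
    for name, settings in config.items():
        # Explicit dependencies from depends_on.
        graph[name].update(settings.get("depends_on", []))
        # Implicit dependencies from interpolated references.
        for value in settings.values():
            if isinstance(value, str):
                graph[name].update(REF.findall(value))
    # A single root that points to every resource.
    graph["root"] = set(config)
    return graph


def validate(graph):
    """Raise ValueError on unknown references, cycles, or multiple roots."""
    for node, deps in graph.items():
        unknown = deps - set(graph)
        if unknown:
            raise ValueError(f"{node} references unknown nodes: {sorted(unknown)}")

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = dict.fromkeys(graph, WHITE)

    def visit(node):
        color[node] = GRAY
        for dep in graph[node]:
            if color[dep] == GRAY:
                raise ValueError(f"cycle detected at {dep}")
            if color[dep] == WHITE:
                visit(dep)
        color[node] = BLACK

    for node in graph:
        if color[node] == WHITE:
            visit(node)

    # A root is a node that nothing else depends on.
    roots = [n for n in graph if not any(n in deps for deps in graph.values())]
    if roots != ["root"]:
        raise ValueError(f"expected a single root, found: {roots}")


graph = build_graph(CONFIG)
validate(graph)
```

Storing each node's dependencies as a set keeps the explicit (`depends_on`) and implicit (interpolation) edges in one structure, so the cycle check does not need to know where an edge came from.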
|
||||||
|
|
||||||
|
## Walking the Graph
|
||||||
|
|
||||||
|
To walk the graph, a standard depth-first traversal is done. Graph
|
||||||
|
walking is done with as much parallelism as possible: a node is walked
|
||||||
|
as soon as all of its dependencies are walked.
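
The rule that a node is walked as soon as all of its dependencies are walked can be sketched as follows. This is a hedged illustration rather than Terraform's own walker: it uses a simple ready-set scheduler on a thread pool instead of a depth-first traversal, and the `walk` function, its `visit` callback, and the example graph are hypothetical.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def walk(graph, visit, max_workers=4):
    """Call visit(node) for every node in graph.

    graph maps each node to the set of nodes it depends on. A node is
    submitted to the pool as soon as all of its dependencies have finished.
    Returns the nodes in the order they completed.
    """
    remaining = {node: set(deps) for node, deps in graph.items()}
    running = {}  # future -> node
    done = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining or running:
            # Start every node whose dependencies are all satisfied.
            ready = [n for n, deps in remaining.items() if not deps]
            for node in ready:
                del remaining[node]
                running[pool.submit(visit, node)] = node
            if not running:
                raise ValueError("cycle detected: no node is ready to walk")
            finished, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in finished:
                node = running.pop(future)
                future.result()  # re-raise any error from visit
                done.append(node)
                # Unblock nodes that were waiting on this one.
                for deps in remaining.values():
                    deps.discard(node)
    return done


example = {"vpc": set(), "subnet": {"vpc"}, "web": {"subnet"}, "role": set()}
order = walk(example, lambda node: None)
```

In this example `vpc` and `role` have no dependencies, so they may be visited concurrently, while `subnet` and `web` always wait for their predecessors.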
|
|
@ -0,0 +1,19 @@
|
||||||
|
---
|
||||||
|
layout: "docs"
|
||||||
|
page_title: "Internals"
|
||||||
|
sidebar_current: "docs-internals"
|
||||||
|
---
|
||||||
|
|
||||||
|
# Terraform Internals
|
||||||
|
|
||||||
|
This section covers the internals of Terraform and explains how
|
||||||
|
plans are generated, the lifecycle of a provider, etc. The goal
|
||||||
|
of this section is to remove any notion of "magic" from Terraform.
|
||||||
|
We want you to be able to trust and understand what Terraform is
|
||||||
|
doing to function.
|
||||||
|
|
||||||
|
<div class="alert alert-block alert-info">
|
||||||
|
<strong>Note:</strong> Knowledge of Terraform internals is not
|
||||||
|
required to use Terraform. If you aren't interested in the internals
|
||||||
|
of Terraform, you may safely skip this section.
|
||||||
|
</div>
|
|
@ -0,0 +1,58 @@
|
||||||
|
---
|
||||||
|
layout: "docs"
|
||||||
|
page_title: "Resource Lifecycle"
|
||||||
|
sidebar_current: "docs-internals-lifecycle"
|
||||||
|
---
|
||||||
|
|
||||||
|
# Resource Lifecycle
|
||||||
|
|
||||||
|
Resources have a strict lifecycle, and can be thought of as basic
|
||||||
|
state machines. Understanding this lifecycle can help you better understand
|
||||||
|
how Terraform generates an execution plan, how it safely executes that
|
||||||
|
plan, and what the resource provider is doing throughout all of this.
|
||||||
|
|
||||||
|
<div class="alert alert-block alert-warning">
|
||||||
|
<strong>Advanced Topic!</strong> This page covers technical details
|
||||||
|
of Terraform. You don't need to understand these details to
|
||||||
|
effectively use Terraform. The details are documented here for
|
||||||
|
those who wish to learn about them without having to go
|
||||||
|
spelunking through the source code.
|
||||||
|
</div>
|
||||||
|
|
||||||
|
## Lifecycle
|
||||||
|
|
||||||
|
A resource roughly follows the steps below:
|
||||||
|
|
||||||
|
1. `ValidateResource` is called to do a high-level structural
|
||||||
|
validation of a resource's configuration. The configuration
|
||||||
|
at this point is raw and the interpolations have not been processed.
|
||||||
|
The value of any key is not guaranteed; this step is just meant to be
|
||||||
|
a quick structural check.
|
||||||
|
|
||||||
|
1. `Diff` is called with the current state and the configuration.
|
||||||
|
The resource provider inspects this and returns a diff, outlining
|
||||||
|
all the changes that need to occur to the resource. The diff includes
|
||||||
|
details such as whether or not the resource is being destroyed, what
|
||||||
|
attribute necessitates the destroy, old values and new values, whether
|
||||||
|
a value is computed, etc. It is up to the resource provider to
|
||||||
|
have this knowledge.
|
||||||
|
|
||||||
|
1. `Apply` is called with the current state and the diff. Apply does
|
||||||
|
not have access to the configuration. This is a safety mechanism
|
||||||
|
that limits the possibility that a provider changes a diff on the
|
||||||
|
fly. `Apply` must apply a diff as prescribed and do nothing else
|
||||||
|
to remain true to the Terraform execution plan. Apply returns the
|
||||||
|
new state of the resource (or nil if the resource was destroyed).
|
||||||
|
|
||||||
|
1. If a resource was just created and did not exist before, and the
|
||||||
|
apply succeeded without error, then the provisioners are executed
|
||||||
|
in sequence. If any provisioner errors, the resource is marked as
|
||||||
|
_tainted_, so that it will be destroyed on the next apply.
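
The sequence above can be sketched as a small driver around a provider. This is an assumption-laden illustration: `FakeProvider`, its method signatures, the `run_lifecycle` helper, and the `tainted` key are hypothetical Python stand-ins and do not reflect the real provider interface, which is defined in Go.

```python
class FakeProvider:
    """Hypothetical in-memory provider used only for illustration."""

    def validate_resource(self, config):
        # Structural check only: values may still hold raw interpolations.
        return [] if "name" in config else ["missing required key: name"]

    def diff(self, state, config):
        # Compare the current state (None if absent) with the configuration
        # and report each changed attribute as (old value, new value).
        old = state or {}
        return {k: (old.get(k), v) for k, v in config.items() if old.get(k) != v}

    def apply(self, state, diff):
        # Apply exactly what the diff prescribes; the configuration is
        # deliberately not passed in.
        new_state = dict(state or {})
        new_state.update({key: new for key, (_, new) in diff.items()})
        return new_state


def run_lifecycle(provider, state, config, provisioners=()):
    """Validate, diff, apply, then provision a newly created resource."""
    errors = provider.validate_resource(config)
    if errors:
        raise ValueError("; ".join(errors))
    diff = provider.diff(state, config)
    if not diff:
        return state  # nothing to change
    created = state is None
    new_state = provider.apply(state, diff)
    if created:
        # Provisioners run in sequence, only for resources that were just created.
        for provision in provisioners:
            try:
                provision(new_state)
            except Exception:
                new_state["tainted"] = True  # destroy on the next apply
                break
    return new_state


state = run_lifecycle(FakeProvider(), None, {"name": "web", "size": "small"})
```

Keeping the configuration out of `apply` mirrors the safety property described in step 3: the only input that can drive a change is the diff that was already planned.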
|
||||||
|
|
||||||
|
## Partial State and Error Handling
|
||||||
|
|
||||||
|
If an error happens at any stage in the lifecycle of a resource,
|
||||||
|
Terraform stores a partial state of the resource. This behavior is
|
||||||
|
critical for Terraform to ensure that you don't end up with any
|
||||||
|
_zombie_ resources: resources that were created by Terraform but
|
||||||
|
are no longer managed by Terraform due to a loss of state.
|
|
@ -158,28 +158,12 @@
|
||||||
<li<%= sidebar_current("docs-internals") %>>
|
<li<%= sidebar_current("docs-internals") %>>
|
||||||
<a href="/docs/internals/index.html">Internals</a>
|
<a href="/docs/internals/index.html">Internals</a>
|
||||||
<ul class="nav">
|
<ul class="nav">
|
||||||
<li<%= sidebar_current("docs-internals-architecture") %>>
|
<li<%= sidebar_current("docs-internals-graph") %>>
|
||||||
<a href="/docs/internals/architecture.html">Architecture</a>
|
<a href="/docs/internals/graph.html">Resource Graph</a>
|
||||||
</li>
|
</li>
|
||||||
|
|
||||||
<li<%= sidebar_current("docs-internals-consensus") %>>
|
<li<%= sidebar_current("docs-internals-lifecycle") %>>
|
||||||
<a href="/docs/internals/consensus.html">Consensus Protocol</a>
|
<a href="/docs/internals/lifecycle.html">Resource Lifecycle</a>
|
||||||
</li>
|
|
||||||
|
|
||||||
<li<%= sidebar_current("docs-internals-gossip") %>>
|
|
||||||
<a href="/docs/internals/gossip.html">Gossip Protocol</a>
|
|
||||||
</li>
|
|
||||||
|
|
||||||
<li<%= sidebar_current("docs-internals-sessions") %>>
|
|
||||||
<a href="/docs/internals/sessions.html">Sessions</a>
|
|
||||||
</li>
|
|
||||||
|
|
||||||
<li<%= sidebar_current("docs-internals-security") %>>
|
|
||||||
<a href="/docs/internals/security.html">Security Model</a>
|
|
||||||
</li>
|
|
||||||
|
|
||||||
<li<%= sidebar_current("docs-internals-jepsen") %>>
|
|
||||||
<a href="/docs/internals/jepsen.html">Jepsen Testing</a>
|
|
||||||
</li>
|
</li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
|