website: internals

Mitchell Hashimoto 2014-07-25 21:37:22 -07:00
parent 9ff8856fe8
commit d5049b65da
5 changed files with 179 additions and 137 deletions

View File

@@ -1,117 +0,0 @@
---
layout: "docs"
page_title: "Terraform Architecture"
sidebar_current: "docs-internals-architecture"
---
# Terraform Architecture
Terraform is a complex system that has many different moving parts. To help
users and developers of Terraform form a mental model of how it works, this
page documents the system architecture.
<div class="alert alert-block alert-warning">
<strong>Advanced Topic!</strong> This page covers technical details of
the internals of Terraform. You don't need to know these details to effectively
operate and use Terraform. These details are documented here for those who wish
to learn about them without having to go spelunking through the source code.
</div>
## Glossary
Before describing the architecture, we provide a glossary of terms to help
clarify what is being discussed:
* Agent - An agent is the long-running daemon on every member of the Terraform cluster.
It is started by running `terraform agent`. The agent is able to run in either *client*,
or *server* mode. Since all nodes must be running an agent, it is simpler to refer to
the node as either being a client or server, but there are other instances of the agent. All
agents can run the DNS or HTTP interfaces, and are responsible for running checks and
keeping services in sync.
* Client - A client is an agent that forwards all RPCs to a server. The client is relatively
stateless. The only background activity a client performs is taking part in the LAN gossip pool.
This has a minimal resource overhead and consumes only a small amount of network bandwidth.
* Server - An agent that is in server mode. When in server mode, there is an expanded set
of responsibilities including participating in the Raft quorum, maintaining cluster state,
responding to RPC queries, WAN gossip to other datacenters, and forwarding queries to leaders
or remote datacenters.
* Datacenter - A datacenter seems obvious, but there are subtle details such as multiple
availability zones in EC2. We define a datacenter to be a networking environment that is
private, low latency, and high bandwidth. This excludes communication that would traverse
the public internet.
* Consensus - When used in our documentation we use consensus to mean agreement upon
the elected leader as well as agreement on the ordering of transactions. Since these
transactions are applied to an FSM, we implicitly include the consistency of a replicated
state machine. Consensus is described in more detail on [Wikipedia](http://en.wikipedia.org/wiki/Consensus_(computer_science)),
as well as our [implementation here](/docs/internals/consensus.html).
* Gossip - Terraform is built on top of [Serf](http://www.serfdom.io/), which provides a full
[gossip protocol](http://en.wikipedia.org/wiki/Gossip_protocol) that is used for multiple purposes.
Serf provides membership, failure detection, and event broadcast mechanisms. Our use of these
is described more in the [gossip documentation](/docs/internals/gossip.html). It is enough to know
gossip involves random node-to-node communication, primarily over UDP.
* LAN Gossip - This is used to mean that there is a gossip pool, containing nodes that
are all located on the same local area network or datacenter.
* WAN Gossip - This is used to mean that there is a gossip pool, containing servers that
are primarily located in different datacenters and must communicate over the internet or
wide area network.
* RPC - RPC is short for a Remote Procedure Call. This is a request / response mechanism
allowing a client to make a request of a server.
## 10,000 foot view
From a 10,000 foot altitude the architecture of Terraform looks like this:
![Terraform Architecture](/images/terraform-arch.png)
Let's break down this image and describe each piece. First of all, we can see
that there are two datacenters, labeled one and two. Terraform has first
class support for multiple datacenters and expects this to be the common case.
Within each datacenter we have a mixture of clients and servers. It is expected
that there be three to five servers. This strikes a balance between
availability in the case of failure and performance, as consensus gets progressively
slower as more machines are added. However, there is no limit to the number of clients,
and they can easily scale into the thousands or tens of thousands.
All the nodes that are in a datacenter participate in a [gossip protocol](/docs/internals/gossip.html).
This means there is a gossip pool that contains all the nodes for a given datacenter. This serves
a few purposes: first, there is no need to configure clients with the addresses of servers,
discovery is done automatically. Second, the work of detecting node failures
is not placed on the servers but is distributed. This makes the failure detection much more
scalable than naive heartbeating schemes. Thirdly, it is used as a messaging layer to notify
when important events such as leader election take place.
The servers in each datacenter are all part of a single Raft peer set. This means that
they work together to elect a leader, which has extra duties. The leader is responsible for
processing all queries and transactions. Transactions must also be replicated to all peers
as part of the [consensus protocol](/docs/internals/consensus.html). Because of this requirement,
when a non-leader server receives an RPC request it forwards it to the cluster leader.
The server nodes also operate as part of a WAN gossip pool. This pool is different from the LAN pool,
as it is optimized for the higher latency of the internet, and is expected to only contain
other Terraform server nodes. The purpose of this pool is to allow datacenters to discover each
other in a low touch manner. Bringing a new datacenter online is as easy as joining the existing
WAN gossip. Because the servers are all operating in this pool, it also enables cross-datacenter requests.
When a server receives a request for a different datacenter, it forwards it to a random server
in the correct datacenter. That server may then forward to the local leader.
This results in a very low coupling between datacenters, but because of failure detection,
connection caching and multiplexing, cross-datacenter requests are relatively fast and reliable.
## Getting in depth
At this point we've covered the high-level architecture of Terraform, but there is much
more detail to each of the sub-systems. The [consensus protocol](/docs/internals/consensus.html) is
documented in detail, as is the [gossip protocol](/docs/internals/gossip.html). The [documentation](/docs/internals/security.html)
for the security model and protocols used is also available.
For other details, either consult the code, ask in IRC, or reach out to the mailing list.

View File

@@ -0,0 +1,98 @@
---
layout: "docs"
page_title: "Resource Graph"
sidebar_current: "docs-internals-graph"
---
# Resource Graph
Terraform builds a
[dependency graph](http://en.wikipedia.org/wiki/Dependency_graph)
from the Terraform configurations, and walks this graph to
generate plans, refresh state, and more. This page documents
the details of what is contained in this graph, what types
of nodes there are, and how the edges of the graph are determined.
<div class="alert alert-block alert-warning">
<strong>Advanced Topic!</strong> This page covers technical details
of Terraform. You don't need to understand these details to
effectively use Terraform. The details are documented here for
those who wish to learn about them without having to go
spelunking through the source code.
</div>
## Graph Nodes
There are only a handful of node types that can exist within the
graph. We'll cover these first before explaining how they're
determined and built:
* **Resource Node** - Represents a single resource. If you have
the `count` metaparameter set, then there will be one resource
node for each count. The configuration, diff, state, etc. of
the resource under change is attached to this node.
* **Provider Configuration Node** - Represents the time to fully
configure a provider. This is when the provider configuration
block is given to a provider, such as AWS security credentials.
* **Resource Meta-Node** - Represents a group of resources, but
does not represent any action on its own. This is done for
convenience on dependencies and making a prettier graph. This
node is only present for resources that have a `count`
parameter greater than 1.
When visualizing a configuration with `terraform graph`, you can
see all of these nodes present.
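To make these node types concrete, here is a minimal sketch in Go of how such
nodes might be modeled. The type and field names are illustrative assumptions,
not Terraform's actual internals:

```go
package main

import "fmt"

// Node is anything that can live in the dependency graph.
type Node interface {
	Name() string
}

// ResourceNode represents a single resource instance. With a count
// of N, the graph holds N of these. The configuration, diff, and
// state of the resource under change would hang off this node.
type ResourceNode struct {
	ID string // e.g. "aws_instance.web.0"
}

func (n *ResourceNode) Name() string { return n.ID }

// ProviderConfigNode represents fully configuring a provider, such
// as handing AWS security credentials to the AWS provider.
type ProviderConfigNode struct {
	Provider string // e.g. "aws"
}

func (n *ProviderConfigNode) Name() string { return "provider." + n.Provider }

// ResourceMetaNode groups the instances of a counted resource and
// represents no action of its own.
type ResourceMetaNode struct {
	Group string // e.g. "aws_instance.web"
	Count int
}

func (n *ResourceMetaNode) Name() string { return n.Group }

func main() {
	nodes := []Node{
		&ProviderConfigNode{Provider: "aws"},
		&ResourceMetaNode{Group: "aws_instance.web", Count: 2},
		&ResourceNode{ID: "aws_instance.web.0"},
		&ResourceNode{ID: "aws_instance.web.1"},
	}
	for _, n := range nodes {
		fmt.Println(n.Name())
	}
}
```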
## Building the Graph
Building the graph is done in a series of sequential steps (a small
runnable sketch follows the list):
1. Resource nodes are added based on the configuration. If a
diff (plan) or state is present, that meta-data is attached
to each resource node.
1. Resources are mapped to provisioners if they have any
defined. This must be done after all resource nodes are
created so resources with the same provisioner type can
share the provisioner implementation.
1. Explicit dependencies from the `depends_on` meta-parameter
are used to create edges between resources.
1. If a state is present, any "orphan" resources are added to
the graph. Orphan resources are any resources that are no
longer present in the configuration but are present in the
state file. Orphans never have any configuration associated
with them, since the state file does not store configuration.
1. Resources are mapped to providers. Provider configuration
nodes are created for these providers, and edges are created
such that the resources depend on their respective provider
being configured.
1. Interpolations are parsed in resource and provider configurations
to determine dependencies. References to resource attributes
are turned into dependencies from the resource with the interpolation
to the resource being referenced.
1. A root node is created. The root node points to all resources and
exists so there is a single root to the dependency graph. When
traversing the graph, the root node is ignored.
1. If a diff is present, all resource nodes are traversed to find
resources that are being destroyed. These resource nodes are split into two:
one node that destroys the resource and another that creates
the resource (if it is being recreated). The reason the nodes must
be split is because the destroy order is often different from the
create order, and so they can't be represented by a single graph
node.
1. The graph is validated to ensure it has no cycles and a single root.
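The sketch below models a few of these steps with a toy graph: edges are added
for provider and interpolation dependencies, a synthetic root is connected,
and validation performs a depth-first cycle check. All names are hypothetical
stand-ins for what Terraform's source actually does:

```go
package main

import (
	"errors"
	"fmt"
)

// Graph is a toy dependency graph keyed by node name.
type Graph struct {
	deps map[string][]string // node -> nodes it depends on
}

func NewGraph() *Graph { return &Graph{deps: make(map[string][]string)} }

// Connect records that "from" depends on "to" (steps 3, 5, and 6).
func (g *Graph) Connect(from, to string) {
	g.deps[from] = append(g.deps[from], to)
	if _, ok := g.deps[to]; !ok {
		g.deps[to] = nil
	}
}

// Validate does a depth-first search for cycles (step 9).
func (g *Graph) Validate() error {
	const white, gray, black = 0, 1, 2
	color := make(map[string]int)
	var visit func(string) error
	visit = func(n string) error {
		color[n] = gray
		for _, d := range g.deps[n] {
			if color[d] == gray {
				return errors.New("cycle detected at " + d)
			}
			if color[d] == white {
				if err := visit(d); err != nil {
					return err
				}
			}
		}
		color[n] = black
		return nil
	}
	for n := range g.deps {
		if color[n] == white {
			if err := visit(n); err != nil {
				return err
			}
		}
	}
	return nil
}

func main() {
	g := NewGraph()
	// Step 6: "${aws_instance.web.id}" in the ELB config becomes an edge.
	g.Connect("aws_elb.lb", "aws_instance.web")
	// Step 5: resources depend on their provider being configured.
	g.Connect("aws_instance.web", "provider.aws")
	// Step 7: a single synthetic root points at every resource.
	g.Connect("root", "aws_elb.lb")
	g.Connect("root", "aws_instance.web")

	fmt.Println("graph valid:", g.Validate() == nil)
}
```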
## Walking the Graph
To walk the graph, a standard depth-first traversal is done. Graph
walking is done with as much parallelism as possible: a node is walked
as soon as all of its dependencies are walked.
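One way to realize that rule in Go, assuming a dependency map like the one in
the sketch above: every node gets a goroutine that blocks until each of its
dependencies signals completion, so independent branches of the graph run in
parallel. This is a sketch of the scheduling idea, not Terraform's actual
walker:

```go
package main

import (
	"fmt"
	"sync"
)

// Walk runs fn once per node, starting a node only after all of its
// dependencies have finished. deps maps each node to the nodes it
// depends on, and every node must appear as a key.
func Walk(deps map[string][]string, fn func(node string)) {
	done := make(map[string]chan struct{}, len(deps))
	for n := range deps {
		done[n] = make(chan struct{})
	}

	var wg sync.WaitGroup
	for n := range deps {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			for _, d := range deps[n] {
				<-done[d] // block until this dependency is walked
			}
			fn(n)          // walk this node
			close(done[n]) // release anything waiting on us
		}(n)
	}
	wg.Wait()
}

func main() {
	deps := map[string][]string{
		"provider.aws":     nil,
		"aws_instance.web": {"provider.aws"},
		"aws_elb.lb":       {"aws_instance.web"},
	}
	Walk(deps, func(n string) { fmt.Println("walking", n) })
}
```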

View File

@@ -0,0 +1,19 @@
---
layout: "docs"
page_title: "Internals"
sidebar_current: "docs-internals"
---
# Terraform Internals
This section covers the internals of Terraform and explains how
plans are generated, the lifecycle of a provider, etc. The goal
of this section is to remove any notion of "magic" from Terraform.
We want you to be able to trust and understand what Terraform is
doing under the hood.
<div class="alert alert-block alert-info">
<strong>Note:</strong> Knowledge of Terraform internals is not
required to use Terraform. If you aren't interested in the internals
of Terraform, you may safely skip this section.
</div>

View File

@@ -0,0 +1,58 @@
---
layout: "docs"
page_title: "Resource Lifecycle"
sidebar_current: "docs-internals-lifecycle"
---
# Resource Lifecycle
Resources have a strict lifecycle, and can be thought of as basic
state machines. Understanding this lifecycle can help better understand
how Terraform generates an execution plan, how it safely executes that
plan, and what the resource provider is doing throughout all of this.
<div class="alert alert-block alert-warning">
<strong>Advanced Topic!</strong> This page covers technical details
of Terraform. You don't need to understand these details to
effectively use Terraform. The details are documented here for
those who wish to learn about them without having to go
spelunking through the source code.
</div>
## Lifecycle
A resource roughly follows the steps below (a sketch of the provider interface follows the list):
1. `ValidateResource` is called to do a high-level structural
validation of a resource's configuration. The configuration
at this point is raw and the interpolations have not been processed.
The value of any key is not guaranteed and is just meant to be
a quick structural check.
1. `Diff` is called with the current state and the configuration.
The resource provider inspects this and returns a diff, outlining
all the changes that need to occur to the resource. The diff includes
details such as whether or not the resource is being destroyed, what
attribute necessitates the destroy, old values and new values, whether
a value is computed, etc. It is up to the resource provider to
have this knowledge.
1. `Apply` is called with the current state and the diff. Apply does
not have access to the configuration. This is a safety mechanism
that limits the possibility that a provider changes a diff on the
fly. `Apply` must apply a diff as prescribed and do nothing else
to remain true to the Terraform execution plan. Apply returns the
new state of the resource (or nil if the resource was destroyed).
1. If a resource was just created and did not exist before, and the
apply succeeded without error, then the provisioners are executed
in sequence. If any provisioner errors, the resource is marked as
_tainted_, so that it will be destroyed on the next apply.
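Taken together, these steps describe the contract a resource provider
implements. Below is a simplified sketch of that interface in Go; the method
names follow the steps above, but the exact signatures and types are
assumptions, not Terraform's real API:

```go
package terraform // illustrative package name only

// ResourceState is a placeholder for the state of one resource.
type ResourceState struct {
	Attributes map[string]string
}

// ResourceDiff is a placeholder for the set of pending changes.
type ResourceDiff struct {
	Destroy bool // whether the change requires destroying the resource
}

// ResourceProvider sketches what a provider implements.
type ResourceProvider interface {
	// ValidateResource does a quick structural check of the raw,
	// uninterpolated configuration, returning warnings and errors.
	ValidateResource(resourceType string, cfg map[string]interface{}) ([]string, []error)

	// Diff compares current state against the configuration and
	// returns the changes required, including whether a destroy is
	// necessary and which attribute forces it.
	Diff(state *ResourceState, cfg map[string]interface{}) (*ResourceDiff, error)

	// Apply executes a diff against the current state. It never sees
	// the configuration: it must apply the diff exactly as prescribed.
	// A nil return state means the resource was destroyed.
	Apply(state *ResourceState, diff *ResourceDiff) (*ResourceState, error)
}
```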
## Partial State and Error Handling
If an error happens at any stage in the lifecycle of a resource,
Terraform stores a partial state of the resource. This behavior is
critical for Terraform to ensure that you don't end up with any
_zombie_ resources: resources that were created by Terraform but
are no longer managed by Terraform due to a loss of state.
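To see why the partial state is kept, consider an apply that fails halfway
through. A hedged sketch, continuing the illustrative types from the interface
sketch above:

```go
// applyResource persists whatever state the provider reports, even
// when Apply fails partway through. If the cloud API created an
// instance before the error, its ID is recorded rather than lost,
// so the resource cannot become a "zombie" that Terraform no longer
// knows about.
func applyResource(p ResourceProvider, state *ResourceState, diff *ResourceDiff) (*ResourceState, error) {
	newState, err := p.Apply(state, diff)
	if newState != nil {
		saveState(newState) // write the (possibly partial) state file
	}
	return newState, err
}

// saveState stands in for persisting the state file to disk.
func saveState(s *ResourceState) { /* ... */ }
```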

View File

@@ -158,28 +158,12 @@
<li<%= sidebar_current("docs-internals") %>> <li<%= sidebar_current("docs-internals") %>>
<a href="/docs/internals/index.html">Internals</a> <a href="/docs/internals/index.html">Internals</a>
<ul class="nav"> <ul class="nav">
<li<%= sidebar_current("docs-internals-architecture") %>> <li<%= sidebar_current("docs-internals-graph") %>>
<a href="/docs/internals/architecture.html">Architecture</a> <a href="/docs/internals/graph.html">Resource Graph</a>
</li> </li>
<li<%= sidebar_current("docs-internals-consensus") %>> <li<%= sidebar_current("docs-internals-lifecycle") %>>
<a href="/docs/internals/consensus.html">Consensus Protocol</a> <a href="/docs/internals/lifecycle.html">Resource Lifecycle</a>
</li>
<li<%= sidebar_current("docs-internals-gossip") %>>
<a href="/docs/internals/gossip.html">Gossip Protocol</a>
</li>
<li<%= sidebar_current("docs-internals-sessions") %>>
<a href="/docs/internals/sessions.html">Sessions</a>
</li>
<li<%= sidebar_current("docs-internals-security") %>>
<a href="/docs/internals/security.html">Security Model</a>
</li>
<li<%= sidebar_current("docs-internals-jepsen") %>>
<a href="/docs/internals/jepsen.html">Jepsen Testing</a>
</li> </li>
</ul> </ul>
</li> </li>