website: vs page

parent a42661004d
commit 0850e5691c
@@ -6,11 +6,13 @@ sidebar_current: "vs-other"
 
 # Terraform vs. Other Software
 
-The problems Terraform solves are varied, but each individual feature has been
-solved by many different systems. Although there is no single system that provides
-all the features of Terraform, there are other options available to solve some of these problems.
-In this section, we compare Terraform to some other options. In most cases, Terraform is not
-mutually exclusive with any other system.
+Terraform provides a flexible abstraction of resources and providers. This model
+allows for representing everything from physical hardware, virtual machines,
+containers, to email and DNS providers. Because of this flexibility, Terraform
+can be used to solve many different problems. This means there are a number of
+existing tools that overlap with the capabilities of Terraform. We compare Terraform
+to a number of these tools, but it's good to note that Terraform is not mutually
+exclusive with any other system, nor does it require total buy-in to be useful.
 
 Use the navigation to the left to read the comparison of Terraform to specific
 systems.

@@ -1,49 +0,0 @@
---
layout: "intro"
page_title: "Terraform vs. Nagios, Sensu"
sidebar_current: "vs-other-nagios-sensu"
---

# Terraform vs. Nagios, Sensu

Nagios and Sensu are both tools built for monitoring. They are used
to quickly notify operators when an issue occurs.

Nagios uses a group of central servers that are configured to perform
checks on remote hosts. This design makes it difficult to scale Nagios,
as large fleets quickly reach the limit of vertical scaling, and Nagios
does not easily scale horizontally. Nagios is also notoriously
difficult to use with modern DevOps and configuration management tools,
as local configurations must be updated when remote servers are added
or removed.

Sensu has a much more modern design, relying on local agents to run
checks and pushing results to an AMQP broker. A number of servers
ingest and handle the result of the health checks from the broker. This model
is more scalable than Nagios, as it allows for much more horizontal scaling,
and a weaker coupling between the servers and agents. However, the central broker
has scaling limits, and acts as a single point of failure in the system.

Terraform provides the same health checking abilities as both Nagios and Sensu,
is friendly to modern DevOps, and avoids the scaling issues inherent in the
other systems. Terraform runs all checks locally, like Sensu, avoiding placing
a burden on central servers. The status of checks is maintained by the Terraform
servers, which are fault tolerant and have no single point of failure.
Lastly, Terraform can scale to vastly more checks because it relies on edge triggered
updates. This means that an update is only triggered when a check transitions
from "passing" to "failing" or vice versa.

In a large fleet, the majority of checks are passing, and even the minority
that are failing are persistent. By capturing changes only, Terraform reduces
the amount of networking and compute resources used by the health checks,
allowing the system to be much more scalable.
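
To make the edge-triggered model concrete, here is a minimal sketch of a local
check loop that only reports status transitions. It is an illustration of the
idea rather than the actual agent code, and `runCheck` is a hypothetical
placeholder for any local health check:

```go
package main

import (
	"fmt"
	"time"
)

// runCheck stands in for any local health check (an HTTP probe, a disk space
// check, etc.). It returns true while the check is passing.
func runCheck() bool {
	// ... perform the real check here ...
	return true
}

func main() {
	lastPassing := runCheck()
	for {
		time.Sleep(10 * time.Second)
		passing := runCheck()
		// Edge triggered: an update is sent only when the status transitions,
		// so a fleet of mostly steady checks generates almost no traffic.
		if passing != lastPassing {
			fmt.Printf("check transitioned: passing=%v, notifying servers\n", passing)
			lastPassing = passing
		}
	}
}
```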

An astute reader may notice that if a Terraform agent dies, then no edge triggered
updates will occur. From the perspective of other nodes all checks will appear
to be in a steady state. However, Terraform guards against this as well. The
[gossip protocol](/docs/internals/gossip.html) used between clients and servers
integrates a distributed failure detector. This means that if a Terraform agent fails,
the failure will be detected, and thus all checks being run by that node can be
assumed failed. This failure detector distributes the work among the entire cluster,
and critically enables the edge triggered architecture to work.

@@ -1,46 +0,0 @@
---
layout: "intro"
page_title: "Terraform vs. Serf"
sidebar_current: "vs-other-serf"
---

# Terraform vs. Serf

[Serf](http://www.serfdom.io) is a node discovery and orchestration tool and is the only
tool discussed so far that is built on an eventually consistent gossip model,
with no centralized servers. It provides a number of features, including group
membership, failure detection, event broadcasts and a query mechanism. However,
Serf does not provide any high-level features such as service discovery, health
checking or key/value storage. To clarify, the discovery feature of Serf is at a node
level, while Terraform provides a service and node level abstraction.

Terraform is a complete system providing all of those features. In fact, the internal
[gossip protocol](/docs/internals/gossip.html) used within Terraform is powered by
the Serf library. Terraform leverages the membership and failure detection features,
and builds upon them.

The health checking provided by Serf is very low level, and only indicates if the
agent is alive. Terraform extends this to provide a rich health checking system
that handles liveness, in addition to arbitrary host and service-level checks.
Health checks are integrated with a central catalog that operators can easily
query to gain insight into the cluster.

The membership provided by Serf is at a node level, while Terraform focuses
on the service level abstraction, with a single node to multiple service model.
This can be simulated in Serf using tags, but it is much more limited, and does
not provide useful query interfaces. Terraform also makes use of a strongly consistent
catalog, while Serf is only eventually consistent.

In addition to the service level abstraction and improved health checking,
Terraform provides a key/value store and support for multiple datacenters.
Serf can run across the WAN but with degraded performance. Terraform makes use
of [multiple gossip pools](/docs/internals/architecture.html), so that
the performance of Serf over a LAN can be retained while still using it over
a WAN for linking together multiple datacenters.

Terraform is opinionated in its usage, while Serf is a more flexible and
general purpose tool. Terraform uses a CP architecture, favoring consistency over
availability. Serf is an AP system, and sacrifices consistency for availability.
This means Terraform cannot operate if the central servers cannot form a quorum,
while Serf will continue to function under almost all circumstances.

@@ -1,41 +0,0 @@
---
layout: "intro"
page_title: "Terraform vs. SkyDNS"
sidebar_current: "vs-other-skydns"
---

# Terraform vs. SkyDNS

SkyDNS is a relatively new tool designed to solve service discovery.
It uses multiple central servers that are strongly consistent and
fault tolerant. Nodes register services using an HTTP API, and
queries can be made over HTTP or DNS to perform discovery.

Terraform is very similar, but provides a superset of features. Terraform
also relies on multiple central servers to provide strong consistency
and fault tolerance. Nodes can use an HTTP API or use an agent to
register services, and queries are made over HTTP or DNS.
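
As a rough sketch of what that flow looks like from a node's perspective, the
snippet below registers a service over a local HTTP API and then discovers
providers with a DNS SRV lookup. The endpoint path, port, payload shape, and
DNS suffix are illustrative assumptions, not the documented interface:

```go
package main

import (
	"bytes"
	"fmt"
	"net"
	"net/http"
)

func main() {
	// Register a "web" service with the local agent over HTTP.
	payload := []byte(`{"Name": "web", "Port": 8080}`)
	req, _ := http.NewRequest(http.MethodPut,
		"http://localhost:8500/v1/agent/service/register", bytes.NewReader(payload))
	if _, err := http.DefaultClient.Do(req); err != nil {
		fmt.Println("register failed:", err)
		return
	}

	// Discover providers of the "web" service with a DNS SRV query.
	_, srvs, err := net.LookupSRV("", "", "web.service.example")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	for _, s := range srvs {
		fmt.Printf("%s:%d\n", s.Target, s.Port)
	}
}
```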

However, the systems differ in many ways. Terraform provides a much richer
health checking framework, with support for arbitrary checks and
a highly scalable failure detection scheme. SkyDNS relies on naive
heartbeating and TTLs, which have known scalability issues. Additionally,
the heartbeat only provides a limited liveness check, versus the rich
health checks that Terraform is capable of.

Multiple datacenters can be supported by using "regions" in SkyDNS;
however, the data is managed and queried from a single cluster. If servers
are split between datacenters, the replication protocol will suffer from
very long commit times. If all the SkyDNS servers are in a central datacenter, then
connectivity issues can cause entire datacenters to lose availability.
Additionally, even without a connectivity issue, query performance will
suffer as requests must always be performed in a remote datacenter.

Terraform supports multiple datacenters out of the box, and it purposely
scopes the managed data to be per-datacenter. This means each datacenter
runs an independent cluster of servers. Requests are forwarded to remote
datacenters if necessary. This means requests for services within a datacenter
never go over the WAN, and connectivity issues between datacenters do not
affect availability within a datacenter. Additionally, the unavailability
of one datacenter does not affect the service discovery of services
in any other datacenter.

@@ -1,57 +0,0 @@
---
layout: "intro"
page_title: "Terraform vs. SmartStack"
sidebar_current: "vs-other-smartstack"
---

# Terraform vs. SmartStack

SmartStack is another tool which tackles the service discovery problem.
It has a rather unique architecture, and has 4 major components: ZooKeeper,
HAProxy, Synapse, and Nerve. The ZooKeeper servers are responsible for storing cluster
state in a consistent and fault tolerant manner. Each node in the SmartStack
cluster then runs both Nerve and Synapse. Nerve is responsible for running
health checks against a service, and registering with the ZooKeeper servers.
Synapse queries ZooKeeper for service providers and dynamically configures
HAProxy. Finally, clients speak to HAProxy, which does health checking and
load balancing across service providers.

Terraform is a much simpler and more contained system, as it does not rely on any external
components. Terraform uses an integrated [gossip protocol](/docs/internals/gossip.html)
to track all nodes and perform server discovery. This means that server addresses
do not need to be hardcoded and updated fleet wide on changes, unlike SmartStack.

Service registration for both Terraform and Nerve can be done with a configuration file,
but Terraform also supports an API to dynamically change the services and checks that are in use.

For discovery, SmartStack clients must use HAProxy, requiring that Synapse be
configured with all desired endpoints in advance. Terraform clients instead
use the DNS or HTTP APIs without any configuration needed in advance. Terraform
also provides a "tag" abstraction, allowing services to provide metadata such
as versions, primary/secondary designations, or opaque labels that can be used for
filtering. Clients can then request only the service providers which have
matching tags.
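
As a hypothetical illustration of tag-based filtering, a client could ask for
only the "primary" providers of a service in the discovery query itself, rather
than baking that knowledge into HAProxy configuration. The port and endpoint
shape below are assumptions for the sake of the example:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Request only healthy providers of the "web" service that carry the
	// "primary" tag; the filtering happens in the query, not in client config.
	resp, err := http.Get("http://localhost:8500/v1/health/service/web?tag=primary")
	if err != nil {
		fmt.Println("query failed:", err)
		return
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body)) // JSON list of matching service providers
}
```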

The systems also differ in how they manage health checking.
Nerve performs local health checks in a manner similar to Terraform agents.
However, Terraform maintains separate catalog and health systems, which allow
operators to see which nodes are in each service pool, as well as providing
insight into failing checks. Nerve simply deregisters nodes on failed checks,
providing limited operator insight. Synapse also configures HAProxy to perform
additional health checks. This causes all potential service clients to check for
liveness. With large fleets, this N-to-N style health checking may be prohibitively
expensive.

Terraform generally provides a much richer health checking system. Terraform supports
Nagios style plugins, enabling a vast catalog of checks to be used. It also
allows for service and host-level checks. There is also a "dead man's switch"
check that allows applications to easily integrate custom health checks. All of this
is also integrated into a Health and Catalog system with APIs enabling operators
to gain insight into the broader system.
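
Nagios style plugins follow a simple convention: the check is any executable,
and its exit code conveys the result (0 for passing, 1 for warning, 2 for
critical). A minimal plugin that verifies a local web server answers with HTTP
200 might look like the sketch below; the URL is a hypothetical example:

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

func main() {
	resp, err := http.Get("http://localhost:8080/health")
	if err != nil {
		fmt.Println("CRITICAL: web server unreachable:", err)
		os.Exit(2)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		fmt.Printf("WARNING: unexpected status %d\n", resp.StatusCode)
		os.Exit(1)
	}
	fmt.Println("OK: web server healthy")
	os.Exit(0)
}
```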

In addition to the service discovery and health checking, Terraform also provides
an integrated key/value store for configuration and multi-datacenter support.
While it may be possible to configure SmartStack for multiple datacenters,
the central ZooKeeper cluster would be a serious impediment to a fault tolerant
deployment.

@@ -1,61 +0,0 @@
---
layout: "intro"
page_title: "Terraform vs. ZooKeeper, doozerd, etcd"
sidebar_current: "vs-other-zk"
---

# Terraform vs. ZooKeeper, doozerd, etcd

ZooKeeper, doozerd and etcd are all similar in their architecture.
All three have server nodes that require a quorum of nodes to operate (usually a simple majority).
They are strongly consistent, and expose various primitives that can be used
through client libraries within applications to build complex distributed systems.

Terraform works in a similar way within a single datacenter with only server nodes.
In each datacenter, Terraform servers require a quorum to operate
and provide strong consistency. However, Terraform has native support for multiple datacenters,
as well as a more complex gossip system that links server nodes and clients.

If any of these systems are used for pure key/value storage, then they all
roughly provide the same semantics. Reads are strongly consistent, and availability
is sacrificed for consistency in the face of a network partition. However, the differences
become more apparent when these systems are used for advanced cases.

The semantics provided by these systems are attractive for building
service discovery systems. ZooKeeper et al. provide only a primitive K/V store,
and require that application developers build their own system to provide service
discovery. Terraform provides an opinionated framework for service discovery, and
eliminates the guesswork and development effort. Clients simply register services
and then perform discovery using a DNS or HTTP interface. Other systems
require a home-rolled solution.

A compelling service discovery framework must incorporate health checking and the
possibility of failures as well. It is not useful to know that Node A
provides the Foo service if that node has failed or the service crashed. Naive systems
make use of heartbeating, using periodic updates and TTLs. These schemes require work linear
in the number of nodes and place the demand on a fixed number of servers. Additionally, the
failure detection window is at least as long as the TTL. ZooKeeper provides ephemeral
nodes, which are K/V entries that are removed when a client disconnects. These are more
sophisticated than a heartbeat system, but also have inherent scalability issues and add
client-side complexity. All clients must maintain active connections to the ZooKeeper servers
and perform keep-alives. Additionally, this requires "thick clients", which are difficult
to write and often result in difficult-to-debug issues.

Terraform uses a very different architecture for health checking. Instead of only
having server nodes, Terraform clients run on every node in the cluster.
These clients are part of a [gossip pool](/docs/internals/gossip.html), which
serves several functions, including distributed health checking. The gossip protocol implements
an efficient failure detector that can scale to clusters of any size without concentrating
the work on any select group of servers. The clients also enable a much richer set of health checks to be run locally,
whereas ZooKeeper ephemeral nodes are a very primitive check of liveness. Clients can check that
a web server is returning 200 status codes, that memory utilization is not critical, that there is sufficient
disk space, etc. The Terraform clients expose a simple HTTP interface and avoid exposing the complexity
of the system to clients in the way that ZooKeeper does.

Terraform provides first-class support for service discovery, health checking,
K/V storage, and multiple datacenters. To support anything more than simple K/V storage,
all these other systems require additional tools and libraries to be built on
top. By using client nodes, Terraform provides a simple API that only requires thin clients.
Additionally, the API can be avoided entirely by using configuration files and the
DNS interface to have a complete service discovery solution with no development at all.
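
To illustrate how thin a client can be, here is a sketch of writing and reading
a configuration value through the K/V store using nothing but plain HTTP. The
port and the /v1/kv path are assumptions for the sake of the example:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

func main() {
	key := "http://localhost:8500/v1/kv/app/config/max-conns"

	// Write a configuration value. No special library, persistent session,
	// or keep-alive connection is required.
	req, _ := http.NewRequest(http.MethodPut, key, strings.NewReader("1024"))
	if _, err := http.DefaultClient.Do(req); err != nil {
		fmt.Println("put failed:", err)
		return
	}

	// Read it back with a plain GET.
	resp, err := http.Get(key)
	if err != nil {
		fmt.Println("get failed:", err)
		return
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body))
}
```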