The AWS Go SDK provides a default request retryer with exponential backoff. It is enabled by setting `MaxRetries` above 0; leaving `MaxRetries` as `nil` defaults to 3 retries. The terraform-aws-provider `config.Client()` sets `MaxRetries` to 0 unless it is explicitly configured above 0. Previously, we did not override this by setting the configuration, so the default request retryer was never invoked.
The default retryer already handles HTTP error codes of 500 and above, including S3's InternalError response, so the extraneous handling can be removed. This will also start automatically retrying many additional cases, such as temporary networking issues or other retryable AWS service responses.
Changes:
* s3/backend: Add `max_retries` argument
* s3/backend: Enhance S3 NoSuchBucket error to include additional information
* Upgrading to 2.0.0 of github.com/hashicorp/go-azure-helpers
* Support for authenticating using Azure CLI
* backend/azurerm: support for authenticating using the Azure CLI
This PR improves the error handling so we can provide better feedback about any service discovery errors that occurred.
Additionally, it adds logic to test for specific versions when discovering a service using `service.vN`. This enables more informative errors that can point out version incompatibilities.
This change enables a few related use cases:
* AWS has partitions outside Commercial, GovCloud (US), and China, which are the only endpoints automatically handled by the AWS Go SDK. DynamoDB locking and credential verification cannot currently be enabled in those regions.
* Allows usage of any DynamoDB-compatible API for state locking
* Allows usage of any IAM/STS-compatible API for credential verification
Use the entitlements to a) determine if the organization exists, and b) as a means to select which backend to use (the local backend with remote state, or the remote backend).
Variable values are marshalled with an explicit type of
cty.DynamicPseudoType, but were being decoded using `ImpliedType` to
try to guess the type. This was causing errors because `ImpliedType`
does not expect to find a late-bound value.
If an instance object in state has an earlier schema version number then
it is likely that the schema we're holding won't be able to decode the
raw data that is stored. Instead, we must ask the provider to upgrade it
for us first, which might also include translating it from flatmap form
if it was last updated with a Terraform version earlier than v0.12.
This ends up being a "seam" between our use of int64 for schema versions
in the providers package and uint64 everywhere else. We intend to
standardize on int64 everywhere eventually, but for now this remains
consistent with existing usage in each layer to keep the type conversion
noise contained here and avoid mass-updates to other Terraform components
at this time.
This also includes a minor change to the test helpers for the
backend/local package, which were inexplicably setting a SchemaVersion of
1 on the basic test state but setting the mock schema version to zero,
creating an invalid situation where the state would need to be downgraded.
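The version comparison at this seam might look like the sketch below. The function name `needsUpgrade` and the error messages are hypothetical; the only details taken from the description above are the mixed `uint64`/`int64` types and the fact that a state newer than the schema is invalid.

```go
package main

import "fmt"

// needsUpgrade compares the schema version recorded in state (uint64) with
// the provider's current schema version (int64). It reports whether the
// provider must be asked to upgrade the stored object first.
func needsUpgrade(stateVersion uint64, providerVersion int64) (bool, error) {
	if providerVersion < 0 {
		return false, fmt.Errorf("invalid provider schema version %d", providerVersion)
	}
	// The int64-to-uint64 conversion is kept in this one place.
	current := uint64(providerVersion)
	switch {
	case stateVersion > current:
		// Downgrades are not supported: the state was written by a newer schema.
		return false, fmt.Errorf("state has schema version %d, newer than the current version %d", stateVersion, current)
	case stateVersion < current:
		return true, nil
	default:
		return false, nil
	}
}

func main() {
	upgrade, err := needsUpgrade(0, 1)
	fmt.Println(upgrade, err)
}
```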
Add support for the new `force-unlock` API and at the same time improve
performance a bit by reducing the number of API calls made when using
the remote backend for state storage only.
Previously we were fetching these from the provider but then immediately
discarding the version numbers because the schema API had nowhere to put
them.
To avoid a late-breaking change to the internal structure of
terraform.ProviderSchema (which is constructed directly all over the
tests) we're retaining the resource type schemas in a new map alongside
the existing one with the same keys, rather than just switching to
using the providers.Schema struct directly there.
The methods that return resource type schemas now return two values,
intentionally creating a little API friction here so each new caller can
be reminded to think about whether they need to do something with the
schema version, though it can be ignored by many callers.
Since this was a breaking change to the Schemas API anyway, this also
fixes another API wart where there was a separate method for fetching
managed vs. data resource types and thus every caller ended up having a
switch statement on "mode". Now we just accept mode as an argument and
do the switch statement within the single SchemaForResourceType method.
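As an illustration of the API shape described above, here is a self-contained sketch. All types are simplified stand-ins defined locally, not the real Terraform types, and returning version 0 for data resources is an assumption of this sketch.

```go
package main

import "fmt"

// ResourceMode and Block are simplified stand-ins for this sketch.
type ResourceMode rune

const (
	ManagedResourceMode ResourceMode = 'M'
	DataResourceMode    ResourceMode = 'D'
)

type Block struct {
	Attributes []string
}

// ProviderSchema keeps schema versions in a separate map that shares its
// keys with ResourceTypes, as the commit message describes.
type ProviderSchema struct {
	ResourceTypes              map[string]*Block
	DataSources                map[string]*Block
	ResourceTypeSchemaVersions map[string]uint64
}

// SchemaForResourceType switches on mode internally, so callers no longer
// need their own switch statement. The second result is the schema version.
func (ps *ProviderSchema) SchemaForResourceType(mode ResourceMode, typeName string) (*Block, uint64) {
	switch mode {
	case ManagedResourceMode:
		return ps.ResourceTypes[typeName], ps.ResourceTypeSchemaVersions[typeName]
	case DataResourceMode:
		// Assumption: data resources carry no stored state to upgrade.
		return ps.DataSources[typeName], 0
	default:
		return nil, 0
	}
}

func main() {
	ps := &ProviderSchema{
		ResourceTypes:              map[string]*Block{"example_thing": {Attributes: []string{"id"}}},
		ResourceTypeSchemaVersions: map[string]uint64{"example_thing": 2},
	}
	schema, version := ps.SchemaForResourceType(ManagedResourceMode, "example_thing")
	fmt.Println(schema.Attributes, version)

	// Callers that do not care about the version can discard it explicitly.
	schema, _ = ps.SchemaForResourceType(DataResourceMode, "example_thing")
	fmt.Println(schema == nil)
}
```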
* backend/azurerm: removing the `arm_` prefix from keys
* removing the deprecated fields test because the deprecation makes it fail
* authentication: support for custom resource manager endpoints
* Adding debug prefixes to the log statements
* adding acceptance tests for msi auth
* including the resource group name in the tests
* backend/azurerm: support for authenticating using a SAS Token
* resolving merge conflicts
* moving the defer to prior to the error
* backend/azurerm: support for authenticating via msi
* adding acceptance tests for msi auth
* including the resource group name in the tests
* support for using the test client via msi
* vendor updates
- updating to v21.3.0 of github.com/Azure/azure-sdk-for-go
- updating to v10.15.4 of github.com/Azure/go-autorest
- vendoring github.com/hashicorp/go-azure-helpers @ 0.1.1
* backend/azurerm: refactoring to use the new auth package
- refactoring the backend to use a shared client via the new auth package
- adding tests covering both Service Principal and Access Key auth
- support for authenticating using a proxy
- rewriting the backend documentation to include examples of both authentication types
* switching to use the built-in logging function
* documenting it's also possible to retrieve the access key from an env var
In order to support free organizations, we need a way to load the `remote` backend and then, depending on the used offering/plan, enable or disable remote operations.
In other words, we should be able to dynamically fall back to the `local` backend if needed, after first configuring the `remote` backend.
To make this work we need to change the way this was done previously when the env var `TF_FORCE_LOCAL_BACKEND` was set. The clear difference is that the env var is available on startup, while the used offering/plan is only known after connecting to TFE.
The changes to how we handle setting the state path on the local backend
broke the heuristic we were using here for detecting migration from one
local backend to another with the same state path, which would by default
end up deleting the state altogether after migration.
We now use the StatePaths method to do this, which takes into account
both the default values and any settings that have been set.
Additionally this addresses a flaw in the old method which could
potentially have deleted all non-default workspace state files if the
"path" setting were changed without also changing the "workspace_dir"
setting. This new approach is conservative because it will preserve all
of the files if any one overlaps.
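The conservative rule above could be expressed as in this sketch. The helper names `anyOverlap` and `filesToDelete` are hypothetical, and the path lists stand in for whatever the StatePaths method reports for the old and new backends.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// anyOverlap reports whether any path used by the old backend is also used
// by the new backend, after normalizing both with filepath.Clean.
func anyOverlap(oldPaths, newPaths []string) bool {
	seen := make(map[string]struct{}, len(oldPaths))
	for _, p := range oldPaths {
		seen[filepath.Clean(p)] = struct{}{}
	}
	for _, p := range newPaths {
		if _, ok := seen[filepath.Clean(p)]; ok {
			return true
		}
	}
	return false
}

// filesToDelete returns the old files that are safe to remove after a
// migration. If any single path overlaps, nothing is deleted.
func filesToDelete(oldPaths, newPaths []string) []string {
	if anyOverlap(oldPaths, newPaths) {
		return nil
	}
	return oldPaths
}

func main() {
	oldPaths := []string{"terraform.tfstate", "terraform.tfstate.backup"}
	newPaths := []string{"./terraform.tfstate", "other.tfstate.backup"}
	fmt.Println(filesToDelete(oldPaths, newPaths))
}
```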
This was failing because we now handle the settings for the local backend
a little differently as a result of decoding it with the HCL2 machinery.
Specifically, the backend.State* fields are now assumed to be what is
given in configuration, and any CLI overrides are maintained separately
in OverrideState* fields so that they can be imposed "just in time" in
StatePaths.
This is particularly important because OverrideStatePath (when set) is
used regardless of workspace name, while StatePath is a suitable value
only for the "default" workspace, with others needing to be constructed
from StateWorkspaceDir instead.
Newer versions of the retryablehttp package use a context, so we need to
add that in our custom `CheckRetry` function.
In addition I removed the `return true, nil` to continue retrying in
case of an error, and instead directly call the `DefaultRetryPolicy`.
This is because the `DefaultRetryPolicy` will now also take the context
into consideration.
This new source type should be used for variables loaded from .tfvars files that were explicitly passed as command line arguments (e.g. -var-file=foo.tfvars).
This work was done against APIs that were already changed in the branch
before work began, and so it doesn't apply to the v0.12 development work.
To allow v0.12 to merge down to master, we'll revert this work out for now
and then re-introduce equivalent functionality in later commits that works
against the new APIs.
There are several steps here and a number of them can include reaching out
to remote servers or executing local processes, so it's helpful to have
some trace logs to better narrow down causes of errors and hangs during
this step.
In earlier refactoring we skipped implementing prior state safety checks,
propagating the target addresses from plan, and verifying that all of
the providers are exactly the same from the plan being created.
This change reinstates those checks, including a new error message for
the "stale plan" situation.
If we don't do this, we can't produce any output when applying a saved
plan file.
Here we also introduce a check to the local backend's ReportResult
function so that it won't panic if CLI init is skipped, although that
will no longer happen in the apply-from-file case due to the change
described in the previous paragraph.
We can't generate a valid plan file without a backend configuration to
write into it, but it's the responsibility of the caller (the command
package) to manage the backend configuration mechanism, so we require it
to tell us what to write here.
This feels a little strange because the backend in principle knows its
own config, but in practice the backend only knows the _processed_ version
of the config, not the raw configuration value that was used to configure
it.
converted the existing testPlanState() from terraform.State to
states.State to fix various plan tests.
reverted the "bandaid" in plans/planfile/tfplan.go - at this moment the
backend tests do not include backend configuration, and so the planfile
package can write the plan file but not read it back in. That will be
revisited in a separate track of work.
I have no confidence in the change to plans/planfile/tfplan.go. The
tests were passing an empty backend config, which planfile was able to
write to a file but not read from the same file. This change let me move
past that and it did not break any tests in the planfile package, but I
am concerned that it introduces undesired behavior.