Previously we relied on a constellation of coincidences for everything to
work out correctly with state serials. In particular, callers needed to
be very careful about mutating states (or not) because many different bits
of code shared pointers to the same objects.
Here we move to a model where all of the state managers always use
distinct instances of state, copied when WriteState is called. This means
that they are truly a snapshot of the state as it was at that call, even
if the caller goes on mutating the state that was passed.
We also adjust the handling of serials so that the state managers ignore
any serials in incoming states and instead just treat each Persist as
the next version after what was most recently Refreshed.
(An exception exists for when nothing has been refreshed, e.g. because
we are writing a state to a location for the first time. In that case
we _do_ trust the caller, since the given state is either a new state
or it's a copy of something we're migrating from elsewhere with its
state and lineage intact.)
The intent here is to allow the rest of Terraform to not worry about
serials and state identity, and instead just treat the state as a mutable
structure. We'll just snapshot it occasionally, when WriteState is called,
and deal with serials _only_ at persist time.
This is intended as a more robust version of #15423, which was a quick
hotfix to an issue that resulted from our previous slopping handling
of state serials but arguably makes the problem worse by depending on
an additional coincidental behavior of the local backend's apply
implementation.
Gove LockInfo a Marshal method for easy serialization, and a String
method for more readable output.
Have the state.Locker implementations use LockError when possible to
return LockInfo and an error.
Have LocalState store and check the lock ID, and strictly enforce
unlocking with the correct ID.
This isn't required for local lock correctness, as we track the file
descriptor to unlock, but it does provide a varification that locking
and unlocking is done correctly throughout terraform.
During backend initialization, especially during a migration, there is a
chance that an existing state could be overwritten.
Attempt to get a locks when writing the new state. It would be nice to
always have a lock when reading the states, but the recursive structure
of the Meta.Backend config functions makes that quite complex.
Close and remove the file descriptor from LocalState if we Unlock the
state. Also remove an empty state file if we created it and it was never
written to. This is mostly to clean up after tests, but doesn't hurt to
not leave empty files around.
The old behavior in this situation was to simply delete the file. Since
we now have a lock on this file we don't want to close or delete it, so
instead truncate the file at offset 0.
Fix a number of related tests
Use a DynamoDB table to coodinate state locking in S3.
We use a simple strategy here, defining a key containing the value of
the bucket/key of the state file as the lock. If the keys exists, the
locks fails.
TODO: decide if locks should automatically be expired, or require manual
intervention.
Add the LockUnlock methods to LocalState and BackupState.
The implementation for LocalState will be platform specific. We will use
OS-native locking on the state files, speficially locking whichever
state file we intend to write to.