Previously our file hashing functions were backed by the same "read file
into memory" function we use for situations like "file" and "templatefile",
meaning that they'd read the entire file into memory first and then
calculate the hash from that buffer.
All of the hash implementations we use here can calculate hashes from a
sequence of smaller buffer writes though, so there's no actual need for
us to create a file-sized temporary buffer here.
This, then, is a small refactoring of our underlying function into two
parts, where one is responsible for deciding the actual filename to load
opening it, and the other is responsible for buffering the file into
memory. Our hashing functions can then use only the first function and
skip the second.
This then allows us to use io.Copy to stream from the file into the
hashing function in smaller chunks, possibly of a size chosen by the hash
function if it happens to implement io.ReaderFrom.
The new implementation is functionally equivalent to the old but should
use less temporary memory if the user passes a large file to one of the
hashing functions.
Previously this function only supported the x509 RSA private key format.
More recent versions of OpenSSH default to generating a new PEM key
format, which this commit now supports using the x/crypto/ssh package.
Also improve the returned error messages for various invalid ciphertext
or invalid private key errors.
In prior versions, we recommended using hash functions in conjunction with
the file function as an idiom for detecting changes to upstream blobs
without fetching and comparing the whole blob.
That approach relied on us being able to return raw binary data from
file(...). Since Terraform strings pass through intermediate
representations that are not binary-safe (e.g. the JSON state), there was
a risk of string corruption in prior versions which we have avoided for
0.12 by requiring that file(...) be used only with UTF-8 text files.
The specific case of returning a string and immediately passing it into
another function was not actually subject to that corruption risk, since
the HIL interpreter would just pass the string through verbatim, but this
is still now forbidden as a result of the stricter handling of file(...).
To avoid breaking these use-cases, here we introduce variants of the hash
functions a with "file" prefix that take a filename for a disk file to
hash rather than hashing the given string directly. The configuration
upgrade tool also now includes a rule to detect the documented idiom and
rewrite it into a single function call for one of these new functions.
This does cause a bit of function sprawl, but that seems preferable to
introducing more complex rules for when file(...) can and cannot read
binary files, making the behavior of these various functions easier to
understand in isolation.
These all follow the pattern of creating a hash and converting it to a
string using some encoding function, so we can write this implementation
only once and parameterize it with a hash factory function and an encoding
function.
This also includes a new test for the sha512 function, which was
previously missing a test and, it turns out, actually computing sha256
instead.
This is based on c811440188 made against the
old "config" package implementations, but also catches a few other cases
where we would previously have printed the private key into the error
messages.
These implementations are adaptations of the existing implementations in
config/interpolate_funcs.go, updated to work with the cty API.
The set of functions chosen here was motivated mainly by what Terraform's
existing context tests depend on, so we can get the contexts tests back
into good shape before fleshing out the rest of these functions.