Rasmus Dahlberg <rasmus.dahlberg(a)glasklarteknik.se> writes:
>> I see one problem with this, though. The monitor can't simply use the
>> ssh-keygen command to verify the signature, since that command expects
>> to get the *message* as input, not the hash thereof. Which kind-of
>> defeats the idea of piggybacking on ssh tools.
> I disagree. The value of piggy-backing on SSH tooling is for the signer
> who can access their private key with good solutions that already exist.
I don't have a very strong opinion, maybe it's "only" a matter of
documentation. If we say "sigsum uses the ssh signature format", I would
expect that to mean that we use it as a black box with inputs and output
according to the spec, but that's not quite what we do.
We could say that we have
  checksum = H(message)
and that the essence of the signature is to sign the checksum, so that
the signature can be verified given only the public key and the checksum
(i.e., without knowing the message). And that we produce signatures in a
way that yields the same signature as feeding message as the input to
ssh-keygen -Y sign -O hashalg=sha256, i.e., M = message in its spec.
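To make the "sign the checksum" reading concrete, here is a minimal sketch of the byte string that the ssh signature format (PROTOCOL.sshsig) hands to the underlying signature algorithm. It only illustrates that this blob depends on the message through H(message) alone, so it can be rebuilt from the checksum. The namespace value and the function names are made up for the example and are not taken from the sigsum spec.

```python
import hashlib
import struct

NAMESPACE = b"example-namespace"  # hypothetical; not the namespace sigsum uses


def ssh_string(data: bytes) -> bytes:
    """Encode an SSH 'string': 4-byte big-endian length followed by the bytes."""
    return struct.pack(">I", len(data)) + data


def blob_from_checksum(checksum: bytes) -> bytes:
    """Build the data that is actually signed, given checksum = SHA256(message)."""
    return (
        b"SSHSIG"                  # magic preamble
        + ssh_string(NAMESPACE)
        + ssh_string(b"")          # reserved field, empty
        + ssh_string(b"sha256")    # hash algorithm name
        + ssh_string(checksum)     # H(message)
    )


def blob_from_message(message: bytes) -> bytes:
    """What a signer holding the full message would compute."""
    return blob_from_checksum(hashlib.sha256(message).digest())


message = bytes(32)  # placeholder 32-byte message
checksum = hashlib.sha256(message).digest()
print(blob_from_message(message) == blob_from_checksum(checksum))  # True
```

A party that holds only the checksum and the public key could feed blob_from_checksum(checksum) to a raw signature verification routine, which is exactly what ssh-keygen does not expose.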
>> message=SHA3(data) ; application layer
> Note that message must be exactly 32 bytes in Sigsum. So, you wouldn't
> be able to use SHA3 here
Side note: SHA3_256 is part of the SHA3 standard.
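As a quick check of the length, using Python's hashlib purely as an illustration (the input is a placeholder), SHA3-256 output is 32 bytes, the same as SHA-256:

```python
import hashlib

data = b"example data"  # placeholder input

sha3_digest = hashlib.sha3_256(data).digest()
sha2_digest = hashlib.sha256(data).digest()

print(len(sha3_digest))  # 32
print(len(sha2_digest))  # 32
```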
> I see your point if it is a desired property to verify leaf signatures
> in isolation with ssh-keygen. Would you say that the complexity is
> decreased, about the same, or increased if this change was proposed?
I think conceptual complexity would be decreased slightly, by which I
mean roughly that it will be slightly easier to document and explain.
For implementation complexity, not much difference (assuming that the
only value we see in actually using the ssh-keygen tool for handling
signatures is for the private key operations).
Let me start with a description of my current understanding of the
checksum field, present in the tree_leaf struct in the spec.
The submitter submits a message M to the log (M typically a hash of some
data not disclosed to the log), together with a public key and
signature. The log first verifies the signature, and then adds a leaf to
the Merkle tree. The signatures are done using the ssh format, configured
to use sha256. This implies that the sign and verify operations on M
include computing SHA256(M); we call this the "checksum" and include it
in the tree_leaf struct together with the signature.
My first question: Does it matter in any way that the checksum happens
to be a value used internally in the ssh signature formatting?
If we instead publish a signature of M created using the SHA512 hash
internally, as is the ssh-keygen -Y sign default, and publish this
signature together with checksum = SHA256(M), wouldn't that work just as
well?
Next question: Do we really need to publish the checksum at all? It
serves as a unique and random-looking identifier for the message M, but
who's using this id? We have the following roles:
1. Submitter. Will collect signatures on the submitted leaf. Obviously
knows everything needed to query for the leaf hash, and will create
the "sigsum proof package" to distribute to sigsum verifiers.
2. Sigsum verifier (the party that gets the message M and wants to
verify that it is properly logged). The verifier needs to get (by
other means than querying the log by itself) all of:
   - the signature of M,
   - the inclusion proof for the leaf including this signature,
   - all related public keys.
As far as I see, the verifier will clearly recompute the checksum, in the
internals of the signature verification. It could also explicitly
compare it to the value stored in the leaf, but what
benefit does that give, if the signature is already verified? On the
other hand, it seems essential to verify that the signature and the
public key hash in the leaf are as expected.
I think this is the core of the question: Is there any reason for the
verifier to validate the checksum stored in the leaf, in addition
to verifying the signature? If not, what use is that field?
3. Witness. The witness doesn't have access to M, so to a witness the
checksum is just a random string; there's no way to validate it. It
could possibly use it to verify the signature (not by using
ssh-keygen though, but by digging into the internals of ssh signatures),
except that the witness is not expected to have access to the
submitter's public key.
4. Monitors. The purpose of a monitor is to query the log and alert
whenever an unexpected signature appears. My understanding of
monitoring is somewhat fuzzy, but I think the monitor is expected to
query for recent tree heads (relying on witness cosignatures to know
that what it gets is recent), download all (new) leaves in the tree,
and filter on one or more public key hashes of interest. For the
leaves found, it will then alert the key owner on "unexpected" checksums.
However, couldn't one do without the checksums and just as well look
at unexpected signatures?
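The monitoring step in item 4 can be sketched roughly as follows. This only illustrates the filtering idea: the Leaf type, its field names, and the unexpected_leaves helper are hypothetical and not taken from the sigsum spec or any implementation, and fetching tree heads and leaves from the log is left out entirely.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Leaf:
    """Hypothetical in-memory view of a logged leaf."""
    checksum: bytes
    signature: bytes
    key_hash: bytes


def unexpected_leaves(
    leaves: Iterable[Leaf],
    my_key_hashes: Set[bytes],
    known_checksums: Set[bytes],
) -> List[Leaf]:
    """Return leaves for our keys whose checksum we did not expect."""
    return [
        leaf
        for leaf in leaves
        if leaf.key_hash in my_key_hashes
        and leaf.checksum not in known_checksums
    ]


# Made-up example data: two leaves for "our" key, one for someone else's.
my_key = hashlib.sha256(b"my public key").digest()
other_key = hashlib.sha256(b"other public key").digest()
expected = hashlib.sha256(b"release 1.0").digest()
surprise = hashlib.sha256(b"something else").digest()

leaves = [
    Leaf(expected, b"sig-a", my_key),
    Leaf(surprise, b"sig-b", my_key),
    Leaf(surprise, b"sig-c", other_key),
]

alerts = unexpected_leaves(leaves, {my_key}, {expected})
print(len(alerts))  # 1
```

Filtering on leaf.signature against a set of known signatures instead of leaf.checksum would be the checksum-free variant; the structure of the loop stays the same.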
The checksum uniquely (except for hash collisions) identifies a single
message M. But the signature itself also uniquely identifies a single
message M (it seems highly unlikely to have collisions, even if we allow
the public key to vary, and in case we insist on having the same public
key, any collision represents a break of the security of the signature
scheme).
The difference is that the checksum can be (re)computed from M only,
while computing the signature also requires the private key. That
sounds like a big difference, but the only roles above that are expected
to know M, are the submitter and the verifier. The submitter by
definition knows the private key. And the verifier should be provided
with the signature by other means, and just verify it.