[Sigsum-general] Re: Sigsum proof specification

7 Jan 2025

Niels Möller via Sigsum-general sigsum-general@lists.sigsum.org
writes:
...
Simon Josefsson via Sigsum-general sigsum-general@lists.sigsum.org
writes:
...

Suggest a filename extension

It seems some people use *.proof although *.sigsum-proof may be more
advertizy.  Or just *.sigsum?
Naming is somewhat hard... On one hand, I like the very explicit
.sigsum-proof, but it would also be nice with something shorter.
Maybe canonical name *.sigsum-proof and a short form like *.ssp (SigSum
Proof), *.prf (sigsum PRooF), *.spf (Sigsum ProoF), *.sps (Sigsum Proof
Signature), *.sis (SIgsum Signature), *.ssi (Sigsum SIgnature), ...?
I think a three character extension would be nice.  I'm currently
considering doing some software release announcements with sigsum proofs
for the artifacts, and the verification instructions and filename
extension/convention are the primary unclear parts now.
...
...

Suggest a filename naming convention

It should also suggest that the common way to name a Sigsum proof file
is to name it after the file it contains a proof for, and include an
example like:
hello-2.1.3.tar.gz
hello-2.1.3.tar.gz.proof
Sounds reasonable as an example, but not sure it needs a stronger
recommendation than that. And behavior of the sigsum-submit tool should
be consistent with whatever convention is documented.
Also keep in mind that a proof could refer to other kinds of objects
than named files, so this is a "special case", although a very common
case.
Yes I am mostly looking for a style guide rather than any exclusionary
requirement here.  What I would dislike is if any of these starts to be
common:
hello-2.1.3.tar.gz-proof
hello-2.1.3.tar.gz-sigsum
sigsum-hello-2.1.3.tar.gz
hello-2.1.3-sigsum.tar.gz
I realize now that the sigsum-submit --help is already fairly clear:
If input files are provided on the command line, each file
    corresponds to one request, and result is written to a
    corresponding output file, based on these rules:
1. If there's exactly one input file, and the -o option is used,
       output is written to that file. Any existing file is overwritten.
2. For a request output, the suffix ".req" is added to the input
       file name.
3. For a proof output, if the input is a request, any ".req"
       suffix on the input file name is stripped. Then the suffix
       ".proof" is added.
4. If the --output-dir option is provided, any directory part of
       the input file name is stripped, and the output is written as a
       file in the specified output directory.
If a corresponding .proof file already exists, that proof is read
    and verified. If the proof is valid, the input file is skipped. If
    the proof is not valid, sigsum-submit exits with an error.
If a corresponding .req output file already exists, it is
    overwritten (TODO: Figure out if that is the proper behavior).
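For illustration, the naming rules in that help text can be sketched roughly as below. This is only a reading of the quoted rules, not the actual sigsum-submit code; the function name `output_path` and its parameters are hypothetical.

```python
import os


def output_path(input_path, kind, input_is_request=False,
                output_file=None, output_dir=None, n_inputs=1):
    """Derive an output file name following the quoted --help rules.

    Hypothetical helper: kind is "req" (request output) or "proof".
    """
    # Rule 1: exactly one input file plus -o means -o wins.
    if n_inputs == 1 and output_file is not None:
        return output_file

    name = input_path
    if kind == "req":
        # Rule 2: request output gets ".req" appended.
        name += ".req"
    elif kind == "proof":
        # Rule 3: strip a ".req" suffix from a request input,
        # then append ".proof".
        if input_is_request and name.endswith(".req"):
            name = name[:-len(".req")]
        name += ".proof"
    else:
        raise ValueError("kind must be 'req' or 'proof'")

    # Rule 4: --output-dir drops the directory part of the input name.
    if output_dir is not None:
        name = os.path.join(output_dir, os.path.basename(name))
    return name
```

With these rules, hello-2.1.3.tar.gz maps to hello-2.1.3.tar.gz.proof, matching the example naming earlier in this thread.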
...
...

Specify a MIME media subtype.  I suggest "text/sigsum-proof".

To be a clear MIME media subtype specification it should discuss
character set encoding concerns.  The document already refers to ASCII
and I suggest making this even more explicit: Sigsum proof files MUST be
7-bit clear ASCII files and MUST NOT contain any byte with the high bit
set.
Makes sense. To be explicit, does this mean that you suggest MIME type
"text/sigsum-proof; charset=ascii" ?
I think the MIME world is quite complex so it is hard to answer.  My
point is that there should be a MIME type like 'text/sigsum-proof' that
has a well-defined (preferably ASCII-based) syntax associated with it.
Reading https://datatracker.ietf.org/doc/html/rfc6838#section-4.2.1 and
https://datatracker.ietf.org/doc/html/rfc6657 makes me prefer to say
that the charset parameter is not used because the format is ASCII.
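The proposed "no byte with the high bit set" requirement is easy to check mechanically. A minimal sketch, with a hypothetical function name and not taken from any Sigsum tool:

```python
def is_seven_bit_ascii(data: bytes) -> bool:
    """Return True if no byte in data has the high bit (0x80) set."""
    return all(byte < 0x80 for byte in data)
```

Note that this only checks the 7-bit property. It says nothing about whether the content is otherwise a well-formed proof.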
...
...

Add an ABNF grammar describing the format.

What concrete utility do you see? If we adopt ABNF, we should consider
adding that also to
https://git.glasklar.is/sigsum/project/documentation/-/blob/main/log.md.
My primary utility of doing that is to lock down the format so we won't
have ten slightly different variants of it.  And alignment with the
MIME/IETF registration process.
...
...

Discuss how to handle non-compliant data.  For example is a "#"
comment line allowed?  Is adding/removing whitespace allowed?  CRLF vs
CR vs LF vs NUL etc delimiters?
Besides possibly being more liberal regarding line end convention (see
below), I see no reason to allow white space variations or comments, do
you?
No.  I was playing devil's advocate.
...
The intention of current spec and implementation is to require a single
newline character (0xa) terminating each line.  Changing that would be
another change to the format.
But for a text/* content type, I would expect the local line end
convention to be accepted, which in a networked setting means one would
have to accept all line end variants. Which might be an argument against
using a text/* type? But I don't know the fine details of the text/*
expectations.
https://datatracker.ietf.org/doc/html/rfc2046#section-4.1.1 says
The canonical form of any MIME "text" subtype MUST always represent a
   line break as a CRLF sequence.
Apparently this doesn't prevent using text/plain on LF-delimited files,
and that seems better than using application/sigsum-proof for what is
essentially text anyway.
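To make the strict reading above concrete (every line terminated by a single 0xa, no other line end variants), a parser could reject deviating input up front. A rough sketch under that assumption; the function name is hypothetical and this is not the check used by the Sigsum tools:

```python
def has_strict_lf_lines(data: bytes) -> bool:
    """Check the strict line end convention discussed above.

    The data must end with LF (0x0a), so the last line is terminated,
    and must contain no CR or NUL bytes, which rules out CRLF-, CR-
    and NUL-delimited variants.
    """
    if not data.endswith(b"\n"):
        return False
    return b"\r" not in data and b"\x00" not in data
```

A more liberal reader, as one might expect for a text/* type, would instead normalize CRLF to LF before parsing. That would be a change to the format as currently specified.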
...
...

Putting the text into an IETF draft would be useful, as a reference
for the MIME media subtype registration and a file format reference.
I'm sure you know the process, but I'm happy to put this together and
submit it if you want.
To me, an internet draft makes sense if and only if we intend to publish
it as an (informational) RFC. Internet drafts are, by definition, not
great references.
An Informational RFC would be nice, although strictly not required.
Instead you could prepare a *.md file specifying things and then fill
out this form:
https://www.iana.org/form/media-types
...
...

Versioning... the following document makes me a little nervous that
the file format is still in flux which is detrimental for deployment:
https://git.glasklar.is/sigsum/project/documentation/-/blob/main/proposals/2...
Given the very preliminary deployment of version 1, I would like to
think about that as optional for implementations, and that coming
deployment should mandate version 2. I would expect sigsum to stick to
version 2 until some "spicy signature" that is not specific to sigsum
logs emerges (but we should not define MIME types in such a way that we
completely rule out a hypothetical sigsum proof version 3).
So I think MIME registration or other standards action should ignore
version 1, or document it as a historic variation. While our tools and
libraries will support reading version 1 for as long as needed.
Okay - having versioning in the format specification is fine, it could
simply say that anything except 'version=2' is undefined behaviour.
...
...
It may be useful to discuss if all file format versions are using the
same filename extension, convention, MIME media sub-type, and if so any
discussion how entities should behave when parsing and generating files.
I think there are two options: 1) Pretend version 1 never existed and
just remove all support for it. 2) Document that applications MUST
generate version 2 format, and applications MUST handle both formats and
MUST discard the short 'leaf' checksum.
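As a sketch of how a reader could implement option 2, the version can be taken from the content itself. This assumes the version appears as a `version=N` line at the very start of the file, which this thread does not spell out; the function names are hypothetical, and the handling of the short 'leaf' checksum is deliberately left out.

```python
def proof_version(data: bytes) -> int:
    """Extract N from an assumed leading "version=N" line."""
    first_line, newline, _rest = data.partition(b"\n")
    prefix = b"version="
    if not newline or not first_line.startswith(prefix):
        raise ValueError("missing version line")
    digits = first_line[len(prefix):]
    if not digits.isdigit():
        raise ValueError("malformed version line")
    return int(digits.decode("ascii"))


def check_readable(data: bytes, accepted=(1, 2)) -> int:
    """Accept only known versions; anything else is rejected."""
    version = proof_version(data)
    if version not in accepted:
        raise ValueError("unsupported proof version %d" % version)
    return version
```

Under option 1, `accepted` would simply be `(2,)`. Either way, a writer would only ever emit version 2.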
How is this handled for other formats for which there are variations,
e.g., multiple versions, or optional features? Are those reflected in
the MIME type (or extension), or is it enough that the MIME type tells
an implementation unambiguously how to extract information about version
and features from the content data? Off the top of my head, having
variation reflected in the MIME type would mainly be useful for content
type negotiation like the Accept: header (which as far as I'm aware is
rare in practice, and not obviously useful for the case of sigsum
proofs).
I believe most formats specify one MIME type once and then do version
rolling inside the format specifications.  Introducing new MIME types
for each new format version is more fragile, and usually doesn't give
any advantages.
/Simon
