[Sigsum-general] Re: Sigsum proof specification

7 Jan 2025

Simon Josefsson via Sigsum-general <sigsum-general@lists.sigsum.org>
writes:
...

Suggest a filename extension

It seems some people use *.proof although *.sigsum-proof may be more
advertizy.  Or just *.sigsum?
Naming is somewhat hard... On one hand, I like the very explicit
.sigsum-proof, but it would also be nice with something shorter.
...

Suggest a filename naming convention

It should also suggest that the common way to name a Sigsum proof file
is to name it after the file it contains a proof for, and include an
example like:
hello-2.1.3.tar.gz
hello-2.1.3.tar.gz.proof
Sounds reasonable as an example, but not sure it needs a stronger
recommendation than that. And behavior of the sigsum-submit tool should
be consistent with whatever convention is documented.
Also keep in mind that a proof could refer to other kinds of objects
than named files, so this is a "special case", although a very common
case.
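
As an illustration of the naming convention in the example above, here
is a minimal sketch that derives the proof file name from the name of
the file it covers. The ".proof" suffix and the helper name
proof_path_for are assumptions for illustration only; the extension is
still under discussion and this is not what sigsum-submit necessarily
does.

```python
from pathlib import Path

# Assumption: ".proof" is used as the suffix; the thread has not settled this.
PROOF_SUFFIX = ".proof"


def proof_path_for(data_file: str) -> Path:
    """Return the proof file name for a named file by appending the suffix."""
    p = Path(data_file)
    # Append rather than replace, so "x.tar.gz" becomes "x.tar.gz.proof".
    return p.with_name(p.name + PROOF_SUFFIX)


print(proof_path_for("hello-2.1.3.tar.gz"))  # hello-2.1.3.tar.gz.proof
```

This only covers the common case of a named file; a proof for some
other kind of object would need its own naming.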
...

Specify a MIME media subtype.  I suggest "text/sigsum-proof".

To be a clear MIME media subtype specification it should discuss
character set encoding concerns.  The document already refers to ASCII
and I suggest making this even more explicit: Sigsum proof files MUST be
7-bit clean ASCII files and MUST NOT contain any byte with the high bit
set.
Makes sense. To be explicit, does this mean that you suggest MIME type
"text/sigsum-proof; charset=ascii" ?
...
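
The proposed rule that a proof file must not contain any byte with the
high bit set is easy to check mechanically. A minimal sketch follows;
the function name is_7bit_ascii and the sample input are made up for
illustration and are not taken from the Sigsum tools or the proof
format.

```python
def is_7bit_ascii(data: bytes) -> bool:
    """True if no byte in data has the high bit (0x80) set."""
    return all(b < 0x80 for b in data)


# Placeholder inputs, not real proof content.
print(is_7bit_ascii(b"key=value\n"))               # True
print(is_7bit_ascii("key=valu\u00e9\n".encode()))  # False (UTF-8 bytes >= 0x80)
```

A reader enforcing the MUST NOT would reject the input outright rather
than try to repair it.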

Add an ABNF grammar describing the format.

What concrete utility do you see? If we adopt ABNF, we should consider
adding that also to
https://git.glasklar.is/sigsum/project/documentation/-/blob/main/log.md.
...

Discuss how to handle non-compliant data.  For example is a "#"
comment line allowed?  Is adding/removing whitespace allowed?  CRLF vs
CR vs LF vs NUL etc delimiters?
Besides possibly being more liberal regarding line end convention (see
below), I see no reason to allow white space variations or comments, do
you?
The intention of the current spec and implementation is to require a
single newline character (0xa) terminating each line.  Changing that
would be another change to the format.
But for a text/* content type, I would expect the local line end
convention to be accepted, which in a networked setting means one would
have to accept all line end variants. Which might be an argument against
using a text/* type? But I don't know the fine details of the text/*
expectations.
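
To make the two positions above concrete, here is a rough sketch of a
strict check (every line terminated by a single 0xa, no CR anywhere)
next to a liberal normalization that maps CRLF and bare CR to LF before
parsing. Both function names are hypothetical, and whether a reader
should ever normalize is exactly the open question; nothing here
reflects what the current implementation does beyond the stated
intention.

```python
def has_strict_lf_lines(data: bytes) -> bool:
    """True if data is non-empty, ends with LF (0x0a), and contains no CR."""
    return data.endswith(b"\n") and b"\r" not in data


def normalize_line_ends(data: bytes) -> bytes:
    """Liberal variant: rewrite CRLF and bare CR line ends as LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Placeholder input, not real proof content.
crlf = b"a=1\r\nb=2\r\n"
print(has_strict_lf_lines(crlf))                       # False
print(has_strict_lf_lines(normalize_line_ends(crlf)))  # True
```

The strict check keeps the format byte-exact, which matters if the file
contents are ever hashed or compared; the liberal variant is what
text/* handling in a networked setting would seem to push toward.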
...

Putting the text into an IETF draft would be useful, as a reference
for the MIME media subtype registration and a file format reference.
I'm sure you know the process, but I'm happy to put this together and
submit it if you want.
To me, an internet draft makes sense if and only if we intend to publish
it as an (informational) RFC. Internet drafts are, by definition, not
great references.
...

Versioning... the following document makes me a little nervous that
the file format is still in flux which is detrimental for deployment:
https://git.glasklar.is/sigsum/project/documentation/-/blob/main/proposals/2...
Given the very preliminary deployment of version 1, I would like to
treat support for it as optional for implementations, and future
deployments should mandate version 2. I would expect sigsum to stick to
version 2 until some "spicy signature" that is not specific to sigsum
logs emerges (but we should not define MIME types in such a way that we
completely rule out a hypothetical sigsum proof version 3).
So I think MIME registration or other standards action should ignore
version 1, or document it as a historic variation, while our tools and
libraries will support reading version 1 for as long as needed.
...
It may be useful to discuss if all file format versions are using the
same filename extension, convention, MIME media sub-type, and if so, how
entities should behave when parsing and generating files.
I think there are two options: 1) Pretend version 1 never existed and
just remove all support for it. 2) Document that applications MUST
generate version 2 format, and applications MUST handle both formats and
MUST discard the short 'leaf' checksum.
How is this handled for other formats for which there are variations,
e.g., multiple versions, or optional features? Are those reflected in
the MIME type (or extension), or is it enough that the MIME type tells
an implementation unambiguously how to extract information about version
and features from the content data? Off the top of my head, having
variation reflected in the MIME type would mainly be useful for content
type negotiation like the Accept: header (which as far as I'm aware is
rare in practice, and not obviously useful for the case of sigsum
proofs).
Regards,
/Niels
