Re: [isabelle] Allowed characters in theory names for document build

On 09/10/17 21:58, Lars Hupel wrote:
> [I took the liberty of moving this to isabelle-users]

Great. Now we can discuss this properly.

>> Isabelles pdflatex document build fails when there is a space or any of
>> Ã, Ã, Ã, Ã and % in a theories name. Possibly more characters are
>> affected.
> I would strongly recommend using plain ASCII characters for theory
> names. Many things might break in unexpected things (apart from document
> preparation) when non-ASCII characters are used.

Theory names live in different formal domains, and certain restrictions
on the name syntax can be derived from that.

  * Outer syntax: syntax category "name" (cf. isar-ref manual), which
means the quoted form can be almost anything, only excluding corner
cases of string representation (e.g. null characters, control
characters, odd Unicode sequences).

  * Inner syntax (via theory name space qualifiers): syntax category
"id" or "short_ident" (cf. isar-ref manual). Otherwise it is in general
not possible to write qualified constant names like
My_Theory.my_constant. It is very unfriendly to produce a theory where
users cannot refer to qualified names within terms later.

  * Formal theory header, both in Isabelle/ML and Isabelle/Scala: there
are differences in string representation (UTF-8 vs. UTF-16) and
sometimes the difference cannot be fully hidden. Note that it is
generally a bad idea to use raw Unicode for names in Isabelle, only
proper Isabelle symbols like \<alpha> or \<forall>.

  * File-system name space on Linux, Mac OS X, Windows, Cygwin. This
imposes various real-world restrictions. Isabelle symbols are excluded
because of the use of the backslash required for Windows. Unicode often
works, but not universally on all platforms in all situations. Spaces
work most of the time, but are very bad style on Unix platforms. (There
is a funny problem in the Cygwin version bundled with Isabelle2016-1 and
Isabelle2017: if Isabelle is installed into a directory name like "test
a" or "testÃ" it works, but if it is "test Ã" it doesn't. Maybe we
should start a thread on the Cygwin mailing list about that Unicode oddity).

Intersecting all these side-conditions, the remainder is indeed a plain
ASCII identifier consisting of: A-Z a-z 0-9 _ (starting with a letter).

Moreover, there is a hard and fast convention to use upper-case words in
singular that are separated by underscores (some particles might not
count as words and remain lower-case). Plural means that several
concepts of the same kind are covered.


  theory Nat   -- a theory of natural numbers
  theory List  -- a theory of lists
  theory Coinductive_List  -- a theory of coinductive lists
  theory Orders  -- a theory with several kinds of orders (plural)
  theory Misc_List_Lemmas  -- a theory with various list lemmas (plural)
  theory Fundamental_Theorem_of_Algebra  -- particle in lower-case

A common mistake is to use plural too often, e.g. a theory "Lists" of
lists would provide a constant "", but this should be
"" instead.

Note that the situation is similar to type names: the type of lists is
called "list" in singular (and in lower-case).

>> To make Isabelle and its document build feature more permissive with
>> filenames, what is the best way to solve these issues?
> I don't think this is expected to work, hence I see very little prospect
> to solve this in the future.

Right. At some point, the system should check that explicitly and reject
strange theory names outright.

The Prover IDE could also provide more hints about further fine-points.


This archive was generated by a fusion of Pipermail (Mailman edition) and MHonArc.