Re: [isabelle] Supporting native Unicode input in Isabelle/jEdit
On Thu, 28 Feb 2013, Christoph LANGE wrote:
I wonder why Isabelle/jEdit does not support native Unicode input, but
only uses Unicode for display?
Taking this question in isolation, it actually should work both for input
and display. The prover sources are stored in a certain text format as
Isabelle "symbols", but that is treated as "encoding" by jEdit, so you
will edit the text in UTF-16 of the JVM. This is not fundamentally
different from having external files in UTF-8, UTF-16, UTF-32 etc. all
recoded to the UTF-16 of Java before editing it.
It is up to the JVM, jEdit, or other input methods to produce such unicode
glyphs. Copy-and-paste should also work as long as you stay within the
Unicode image of the prover sources.
Just yesterday, I used this myself two times:
* In one of the private mails to you, I made some Isar proofs that were
copied from Isabelle/jEdit into the terminal where my mail client was
running. This worked sometimes but not always: the combination of
terminal + alpine had sometimes problems coping with the multibyte UTF-8
version of the text, so in the end I was not 100% sure if you actually
received the proof in a way I had in mind.
* Posting a small Isar proof on stackoverflow, via the same copy-paste
technique JVM -> Firefox, it worked without problems.
Nowadays, where Unicode is widely supported
There is a long story behind that, some of it is discussed in my MKM 2011
paper, see section 2.1 in
In short, I do not subscribe to the "universality" of Unicode, there are
just a bit too many problems with it. The approach in Isabelle/Scala and
Isabelle/jEdit is to use it as "poor man's mathematical rendering" in the
front-end -- so your funny keyboard should work with it after some
convincing. The prover itself is free from Unicode worries, and the
ongoing evolution of the Unicode standards in the past decades. Theory
sources are stored persistenly in a unicode-free manner, but it tolerates
native UTF-8 to some extent.
After the experience of struggling with the many faces of Unicode in the
past 3 years, I would go as far to say that an LCF-style prover should
*not* allow Unicode at its kernel. There are just too many dangers of
getting things utterly wrong (e.g. code-points being swapped or normalized
according to strict rules of the standard, which are sometimes implemented
in this way or that way.)
This archive was generated by a fusion of
Pipermail (Mailman edition) and