Re: [isabelle] [isabelle-dev] copying x-symbols from jEdit has an issue

On Thu, 21 Jun 2012, Nadezhda Baklanova wrote:

Dear all,

there is a minor issue in jEdit IDE: when text is copied from jEdit and pasted anywhere except jEdit (any other text editor), x-symbols are copied as unicode characters but not as Isabelle codes. For example, lambda is pasted as Greek character lambda and not as \<lambda>.

If the file with copied text is later opened in emacs ProofGeneral, there are lots of parsing errors at these unicode characters.

(This got stuck on isabelle-dev for quite some time, where it is basically off-topic. See also the brief descriptions of the mailing lists on the main Isabelle website.)

Concerning "x-symbols", there is quite a bit of confusion, including this terminology, which stems from an obsolete Emacs package of that name from a long time ago. Even in current Proof General 4.x there are no x-symbols anymore, but Unicode characters for rendering the special Isabelle symbols like \<alpha>, \<forall> etc.

In contrast to Proof General 4.x, Isabelle/jEdit (and the underlying Isabelle/Scala) platform do the rendering of symbols in physical unicode on the JVM by actual translation back and forth, now just by appearance. This then allows to copy-paste in the front-end space, including other front-end renderings like the Isabelle HTML output. It also allows the editor to search over files using the unicode that you see on the surface, although the prover sources are still in this encoding-free ASCII format of Isabelle symbols. (The latter goes back to Isabelle98, and has remained mostly stable since then, while there were several different Unicode standards coming and going in the meantime.)

So far so good. Problems arise when you mix up front-end views (with physical unicode rendering) and back-end sources that are given to the prover by a side entry, without a Isabelle/Scala compliant frontend in between. If you produce such a unicode .thy file accidentally, you can convert it by opening in Isabelle/jEdit and saving it again with the UTF8-Isabelle encoding that should be enabled by default. (jEdit also allows to switch encodings explicitly, and tries to remember that choice for later.)

Technically that encoding is managed by jEdit, independently of the raw Java VM encoding concept for text. At some point, I might move the UTF8-Isabelle conversion down to the JVM itself. I've also heard that the JVM clipboard can be somehow made sensitive to encodings, with potential alternatives. The latter might make things work yet more smoothly in the future, but it needs further investigation of Unicode on the JVM (which is an endless topic).


This archive was generated by a fusion of Pipermail (Mailman edition) and MHonArc.