Re: [isabelle] Unicode (Was: Update on I3P)



Hi,

Am Freitag, den 10.09.2010, 18:20 +0200 schrieb Makarius:
> On Fri, 10 Sep 2010, Joachim Breitner wrote:
> > I should add that this sequence of four characters is the utf8
> > representation of 𝒵 (\<Z>), re-encoded as utf8:
> > $ echo -n ð??µ|iconv -futf8 -tlatin1
> > 𝒵
> > $ echo -n 𝒵 | iconv -flatin1 -tutf8
> > ð??µ
> >
> > Maybe i3p has problems with unicode character code points that do not
> > fit in one 16-bit number?
> 
> This is a deeper problem of the way Unicode works on the JVM, see also 
> http://download.oracle.com/javase/6/docs/api/java/lang/Character.html
> 
> Few Java applications get this specification of "surrogate code points" 
> right, because this situation occurs rarely.  UTF-8 works generally better 
> because the multi-character sequences are the rule, not the exception.
> 
> It is unlikely that this will get fundamentally better on the JVM, 
> although some people have already started joking that Oracle will soon 
> identify "" and null for type String ...
> 
> The easyiest fix is to remove these non-16bit codes from the table of 
> Isabelle symbol interpretation.

Yes, this is a work-around (although I had to edit the copy of the
symbols file in i3p/modules/org-i3p-fontsupport.jar). A proper fix would
be nice, though – Unicode is just too shiny :-)

Greetings,
Joachim

-- 
Joachim Breitner
  e-Mail: mail at joachim-breitner.de
  Homepage: http://www.joachim-breitner.de
  ICQ#: 74513189
  Jabber-ID: nomeata at joachim-breitner.de

Attachment: signature.asc
Description: This is a digitally signed message part



This archive was generated by a fusion of Pipermail (Mailman edition) and MHonArc.