Re: [isabelle] sloccount for .thy files?



On Wed, 1 Aug 2012, Rafal Kolanski wrote:

I have ran your program on the attached Scratch.thy

Your example contains various Isar markup commands like this:

  section {* ... *}
  text {* ... *}

These are not comments, they are part of the formal text (and they consist of more than one token anyway).


You could filter out both (* ... *) and {* ... *} tokens like this:

  if !(token.is_comment || token.kind == Token.Kind.VERBATIM)

Although that is not very precise, since verbatim tokens can occur in other non-markup commands as well, say 'ML' or 'method_setup'.


It requires a little bit of outer syntax parsing to detect

  MARKUP_COMMAND VERBATIM_TOKEN

separated by whitespace or comments (or tags). See src/Pure/Thy/thy_output.ML:present_thy/markup for the fully precise way to detect that form.


To discriminate markup command tokens in Scala you need information from the outer syntax taken from the prover session, lets say from Isabelle/Pure or HOL as approximation (assuming that the application does not introduce its own markup commands). The following table has been guessed from a a search for Outer_Syntax.markup_command on the ML sources:

  val markup_command =
    Set(
      "header",
      "chapter",
      "section",
      "subsection",
      "subsubsection",
      "text",
      "text_raw",
      "sect",
      "subsect",
      "subsubsect",
      "txt",
      "txt_raw")

This set can then be used like a predicate in Scala, with normal application notation:

  token.is_command && markup_command(token.content)


Then you probably want to use parser combinators from the Scala library to work out some structure of the Isar token list. See also src/Pure/Isar/parse.scala and some other places that use that Parse module.


Moreover, you probably also want to take care of vacous lines after the stripping.

It depends what you want to count and to achieve in the end, how much precision you want to apply.


	Makarius





This archive was generated by a fusion of Pipermail (Mailman edition) and MHonArc.