Regarding https://arxiv.org/abs/2006.09265, I was writing a parser for .thy files in Python to extract information. The code should be available from the supplementary material of the ICLR submission (https://openreview.net/forum?id=Pzj6fzU6wkj). The framework is hacky and fragile but may be OK for early experimentation or quick prototyping. Feel free to reuse it as you see fit.
For serious and robust development, I, too, believe the Isabelle/Scala interface is the best option. Dominique’s scala-isabelle library (https://github.com/dominique-unruh/scala-isabelle) could be of great use.