Sorry for the delay in answering. Investigating Nitpick failures is tricky, especially that I developed the tool over 10 years ago.
I wasn't quite able to put my finger on the deep problem -- perhaps simply an issue in the translation from HOL terms to first-order relational logic (Kodkod's logic), which is optimized and hence quite subtle. However, while investigating the issue, I rephrased one of the codatatype constraints, hoping this would make the problem go away, and it did (at least with the "box" option on). This is not entirely satisfactory, but my middle-term goal is to replace Nitpick with the experimental Nunchaku, so I don't want to spend too much time on Nitpick.
If you wonder, the issue seems to be that the (lazy) lists [a] and [a, a] are considered bisimilar even if they are not. The bisimilarity check has a depth (bisim_depth), and indeed with bisim_depth = 1 we'd expect [a] and [a, a] to be identical. But here we're working with bisim_depth = 2, and hence they're different and thus it should be possible to construct a model that contains both. (The list [a] is needed to build [a, a], so it's unavoidable in a model.)