Basic Debug Combinators (parsley.debug)

Parsley has a collection of basic debugging utilities found within parsley.debug. These can help debug errant parsers, understand how error messages have been generated, and provide a rough sense of which parsers take the most time to execute.

The combinators themselves are all contained within parsley.debug.DebugCombinators.

The Scaladoc for this page can be found at parsley.debug and parsley.debug.DebugCombinators.

Debugging Problematic Parsers (debug)

The most common quick debugging combinator is debug, which at its simplest prints some information on entering and exiting a combinator:

import parsley.Parsley.atomic
import parsley.character.string
import parsley.debug, debug._

val hello = ( atomic(string("hello").debug("hello")).debug("atomic1")
            | string("hey").debug("hey")
            | string("hi").debug("hi")
            )
// hello: parsley.Parsley[String] = parsley.Parsley@57bf6b19

debug.disableColourRendering()

hello.parse("hey")
// >atomic1> (1, 1): hey•
//                   ^
//   >hello> (1, 1): hey•
//                   ^
//   <hello< (1, 3): hey• Fail
//                     ^
// <atomic1< (1, 1): hey• Fail
//                   ^
// >hey> (1, 1): hey•
//               ^
// <hey< (1, 4): hey• Good
//                  ^
// res1: parsley.Result[String, String] = Success(hey)

hello.parse("hi")
// >atomic1> (1, 1): hi•
//                   ^
//   >hello> (1, 1): hi•
//                   ^
//   <hello< (1, 2): hi• Fail
//                    ^
// <atomic1< (1, 1): hi• Fail
//                   ^
// >hey> (1, 1): hi•
//               ^
// <hey< (1, 2): hi• Fail
//                ^
// res2: parsley.Result[String, String] = Failure((line 1, column 1):
//   unexpected "hi"
//   expected "hello" or "hey"
//   >hi
//    ^^)

In the above example, an unexpected failure to parse the input "hi" is being debugged using debug. Each of the string combinators is annotated, as is the atomic. This shows the control flow of the parser as it executes, as well as how far into the input it read. In this case, we can see that atomic has undone the input consumed by "hello", but nothing does the same for "hey": it fails at column 2 having consumed the h, so the "hi" alternative is never tried. In other words, there is another atomic missing! We could have added debug to the | combinator as well to make it even clearer, of course.

Though not visible above, the output is usually coloured. If this causes problems, debug.disableColourRendering() will disable it globally, or the coloured parameter can be set to false on an individual combinator.
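One possible fix for the missing atomic diagnosed above is sketched here. The name helloFixed is ours, introduced only for illustration; the combinators are the same ones used in the example.

```scala
import parsley.Parsley.atomic
import parsley.character.string

// Hypothetical fixed parser: both alternatives that share the 'h' prefix
// are wrapped in atomic, so a failure rolls back any input they consumed.
val helloFixed = atomic(string("hello")) | atomic(string("hey")) | string("hi")

// With the rollback in place, the final alternative gets to see the
// input from column 1 again, so "hi" can be parsed.
val fixedResult = helloFixed.parse("hi")
println(fixedResult)
```

The last alternative does not need atomic, since there is nothing after it to backtrack to.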

Breakpoints

The debug combinator can also pause the parser at a breakpoint. The Breakpoint type has values EntryBreak, ExitBreak, FullBreak, and NoBreak, which is the default. EntryBreak pauses execution on entry to the combinator, requiring input on the console to proceed; ExitBreak does the same on exit; FullBreak has the effect of both combined.
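As a rough sketch of how a breakpoint might be attached: this assumes the Breakpoint value is passed as the second argument to debug, which should be checked against the Scaladoc for your Parsley version. The name paused is hypothetical.

```scala
import parsley.character.string
import parsley.debug._

// Assumption: the breakpoint is the second argument to debug.
// EntryBreak should pause just before "hey" is attempted and wait
// for input on the console before continuing.
val paused = string("hey").debug("hey", EntryBreak)

// Running the parser would block until something is entered on the
// console, so the call is left commented out in this sketch.
// paused.parse("hey")
```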

Watching References

The debug combinator takes a variadic number of reference/name pairs as its last argument. These allow you to watch the values stored in references during the debugging process. For instance:

import parsley.Parsley.atomic
import parsley.state._
import parsley.character.string
import parsley.debug._

val p = 0.makeRef { r1 =>
    false.makeRef { r2 =>
        val p = (  string("hello")
                ~> r1.update(_ + 5)
                ~> ( string("!")
                   | r2.set(true) ~> string("?")
                   ).debug("punctuation", r2 -> "r2")
                )
        r1.rollback(atomic(p).debug("hello!", r1 -> "r1", r2 -> "r2"))
          .debug("rollback", r1 -> "r1")
    }
}
// p: parsley.Parsley[String] = parsley.Parsley@558063b4

p.parse("hello world")
// >rollback> (1, 1): hello·
//                    ^
// watched registers:
//     r1 = 0
// 
//   >hello!> (1, 1): hello·
//                    ^
//   watched registers:
//       r1 = 0
//       r2 = false
//   
//     >punctuation> (1, 6): hello·world•
//                                ^
//     watched registers:
//         r2 = false
//     
//     <punctuation< (1, 6): hello·world• Fail
//                                ^
//     watched registers:
//         r2 = true
//     
//   <hello!< (1, 1): hello· Fail
//                    ^
//   watched registers:
//       r1 = 5
//       r2 = true
//   
// <rollback< (1, 1): hello· Fail
//                    ^
// watched registers:
//     r1 = 0
// 
// res3: parsley.Result[String, String] = Failure((line 1, column 6):
//   unexpected space
//   expected "!" or "?"
//   >hello world
//         ^)

Debugging Error Messages (debugError)

The debugError combinator is slightly more experimental, and aims to provide some (lower-level) insight into how an error message came to be. For instance:

import parsley.character.{letter, digit, char}
import parsley.Parsley.many
import parsley.debug._

val q = (many( ( digit.debugError("digit")
               | letter.debugError("letter")
               ).debugError("letter or digit")
             ).debugError("many letterOrDigit")
      ~> many(char('@').debugError("@")).debugError("many @")
      ~> char('#').debugError("#")) | char('!').debugError("!")
// q: parsley.Parsley[Char] = parsley.Parsley@284e9cc8

q.parse("$")
// >many letterOrDigit> (offset 0, line 1, col 1): current hints are Set() (valid at offset 0)
//   >letter or digit> (offset 0, line 1, col 1): current hints are Set() (valid at offset 0)
//     >digit> (offset 0, line 1, col 1): current hints are Set() (valid at offset 0)
//     <digit< (offset 0, line 1, col 1): Fail
//     generated vanilla error (offset 0, line 1, col 1) {
//       unexpected item = "$"
//       expected item(s) = Set(digit)
//       reasons = no reasons given
//     }
//     >letter> (offset 0, line 1, col 1): current hints are Set() (valid at offset 0)
//     <letter< (offset 0, line 1, col 1): Fail
//     generated vanilla error (offset 0, line 1, col 1) {
//       unexpected item = "$"
//       expected item(s) = Set(letter)
//       reasons = no reasons given
//     }
//   <letter or digit< (offset 0, line 1, col 1): Fail
//   generated vanilla error (offset 0, line 1, col 1) {
//     unexpected item = "$"
//     expected item(s) = Set(letter, digit)
//     reasons = no reasons given
//   }
// <many letterOrDigit< (offset 0, line 1, col 1): Good, current hints are Set(letter, digit) with all added since entry to debug (valid at offset 0)
// >many @> (offset 0, line 1, col 1): current hints are Set(letter, digit) (valid at offset 0)
//   >@> (offset 0, line 1, col 1): current hints are Set(letter, digit) (valid at offset 0)
//   <@< (offset 0, line 1, col 1): Fail
//   generated vanilla error (offset 0, line 1, col 1) {
//     unexpected item = "$"
//     expected item(s) = Set("@", letter, digit)
//     reasons = no reasons given
//   }
// <many @< (offset 0, line 1, col 1): Good, current hints are Set("@", letter, digit) with Set("@") added since entry to debug (valid at offset 0)
// >#> (offset 0, line 1, col 1): current hints are Set("@", letter, digit) (valid at offset 0)
// <#< (offset 0, line 1, col 1): Fail
// generated vanilla error (offset 0, line 1, col 1) {
//   unexpected item = "$"
//   expected item(s) = Set("@", letter, digit, "#")
//   reasons = no reasons given
// }
// >!> (offset 0, line 1, col 1): current hints are Set() (valid at offset 0)
// <!< (offset 0, line 1, col 1): Fail
// generated vanilla error (offset 0, line 1, col 1) {
//   unexpected item = "$"
//   expected item(s) = Set("!")
//   reasons = no reasons given
// }
// res4: parsley.Result[String, Char] = Failure((line 1, column 1):
//   unexpected "$"
//   expected "!", "#", "@", digit, or letter
//   >$
//    ^)

In the above example, you can see how each individual error is raised, as well as evidence of merging, and how errors are turned into "hints" when a failure is successfully recovered from. The labels in those hints may be re-incorporated into a later error if it occurs at the offset where the hints are valid, as seen in the error for char('#').
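The effect of hints can also be observed without debugError, by inspecting the final message. The following is a minimal sketch with hypothetical names (hinted, hintedMsg): many(digit) succeeds on "$" without consuming input, and the failed digit is expected to resurface in the error raised by char('#') at the same offset.

```scala
import parsley.Failure
import parsley.Parsley.many
import parsley.character.{char, digit}

// many(digit) recovers from the failing digit, leaving it behind as a hint.
val hinted = many(digit) ~> char('#')

// Extract the error message; the empty string stands in for an
// (unexpected) success so that the value is always a String.
val hintedMsg: String = hinted.parse("$") match {
    case Failure(msg) => msg
    case _            => ""
}
println(hintedMsg)
```

Both the "#" and digit labels should appear in the expected items, since the hint and the new error share offset 0.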

Profiling Parsers (Profiler)

The Profiler class, and the accompanying profile combinator, provide a rough guideline of how much of the runtime a parser might be taking up. The execution of each combinator is measured with a resolution of 100ns. First, a Profiler object must be set up and implicitly available in scope: its role is to collect the profiling samples. Then, a parser annotated with profile combinators is run, and the results can be displayed with profiler.summary().

As a disclaimer, the profiler "just provides data": no guarantee about its statistical significance is given. Multiple runs can be performed and their samples will be aggregated; the profiler can be reset using clear().

import parsley.Parsley
import parsley.Parsley.many
import parsley.character.{string, char}
import parsley.combinator.traverse
import parsley.debug._

def classicString(s: String): Parsley[String] =
    traverse(char(_), s.toList: _*).map(_.mkString)

implicit val profiler: Profiler = new Profiler
// profiler: Profiler = ...
val strings = many(classicString("...").profile("classic string")
               <~> string("!!!").profile("optimised string"))
// strings: Parsley[List[(String, String)]] = ...
val stringsVoid = many(classicString("...").profile("voided classic string")
                   <~> string("!!!").profile("voided optimised string")).void
// stringsVoid: Parsley[Unit] = ...

strings.parse("...!!!" * 10000)
// res5: parsley.Result[String, List[(String, String)]] = ...
stringsVoid.parse("...!!!" * 10000)
// res6: parsley.Result[String, Unit] = ...

profiler.summary()
// name                    self time   num calls   average self time
// -----------------------------------------------------------------
// classic string          24087.5μs       10001             2.408μs
// voided optimised string   822.3μs       10000             0.082μs
// optimised string         7072.3μs       10000             0.707μs
// voided classic string    2492.5μs       10001             0.249μs
// -----------------------------------------------------------------

The above example shows that the string combinator is much faster than the "classic" definition in terms of traverse and char (not even accounting for its improved error messages!). It also shows that when the results are not required (as indicated by the void combinator, which aggressively suppresses result generation underneath it), the two perform much more similarly. Expect some variance depending on when you visit this page: these results are regenerated on each publish, and sometimes the non-voided string can outperform the voided one by pure chance!

The profile combinator accounts for other profiled combinators underneath it: time spent in a nested profiled parser is attributed to that parser, so each entry reports its own "self time" only. This helps to measure the impact of a specific sub-parser more accurately.
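A small standalone sketch of nested profiling follows; the names nestedProfiler, inner, and outer are hypothetical. The timings vary from run to run, so no output is shown. With two runs aggregated, each entry should record two calls, and the "outer" entry should exclude the time spent in "inner".

```scala
import parsley.character.string
import parsley.debug._

// The profiler collects samples from every profile combinator in scope.
implicit val nestedProfiler: Profiler = new Profiler

// "inner" is profiled on its own and again as part of "outer".
val inner = string("abc").profile("inner")
val outer = (inner ~> string("def")).profile("outer")

// Two runs: their samples are aggregated by the same profiler.
val firstRun = outer.parse("abcdef")
val secondRun = outer.parse("abcdef")

// Print the aggregated table, then reset before any further measurements.
nestedProfiler.summary()
nestedProfiler.clear()
```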