Basic Debug Combinators (parsley.debug)
Parsley has a collection of basic debugging utilities found within parsley.debug. These can help
debug errant parsers, understand how error messages have been generated, and provide a rough sense
of which parsers take the most time to execute.
The combinators themselves are all contained within parsley.debug.DebugCombinators.
The Scaladoc for this page can be found at parsley.debug and parsley.debug.DebugCombinators.
For a more comprehensive debugging system, check out the parsley-debug library, which has its own
section on the navigation bar.
Debugging Problematic Parsers (debug)
The most common quick debugging combinator is debug, which at its simplest prints some information
on entering and exiting a combinator:
import parsley.Parsley.atomic
import parsley.character.string
import parsley.debug, debug._
val hello = ( atomic(string("hello").debug("hello")).debug("atomic1")
| string("hey").debug("hey")
| string("hi").debug("hi")
)
// hello: parsley.Parsley[String] = parsley.Parsley@13ef6ea
debug.disableColorRendering()
hello.parse("hey")
// >atomic1> (1, 1): hey•
// ^
// >hello> (1, 1): hey•
// ^
// <hello< (1, 3): hey• Fail
// ^
// <atomic1< (1, 1): hey• Fail
// ^
// >hey> (1, 1): hey•
// ^
// <hey< (1, 4): hey• Good
// ^
// res1: parsley.Result[String, String] = Success(hey)
hello.parse("hi")
// >atomic1> (1, 1): hi•
// ^
// >hello> (1, 1): hi•
// ^
// <hello< (1, 2): hi• Fail
// ^
// <atomic1< (1, 1): hi• Fail
// ^
// >hey> (1, 1): hi•
// ^
// <hey< (1, 2): hi• Fail
// ^
// res2: parsley.Result[String, String] = Failure((line 1, column 1):
// unexpected "hi"
// expected "hello" or "hey"
// >hi
// ^^)
In the above example, an unexpected failure to parse the input "hi" is being debugged using debug.
Each of the string combinators is annotated, as is the atomic. This allows us to see the control
flow of the parser as it executes, as well as where the input was read up to.
In this case, we can see that atomic has undone the input consumed by hello, but nothing does the
same for hey: it fails at column 2 having already consumed the "h", so the hi branch is never tried.
In other words, there is another atomic missing! We could have added debug to the | combinator
as well to make it even clearer, of course. Though not visible above, the output is usually coloured.
If this causes problems, debug.disableColorRendering() will disable it, or the colored parameter
can be set to false on the combinator.
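The missing atomic identified above can be added to the second branch. The following is a minimal sketch of the repaired parser with the debug annotations removed; the name fixed is illustrative only and is not part of the original example.

```scala
import parsley.Parsley.atomic
import parsley.character.string

// Every branch that shares a leading "h" with a later branch is wrapped in
// atomic, so a partial match rewinds the input before `|` tries the next one.
val fixed = ( atomic(string("hello"))
            | atomic(string("hey"))
            | string("hi")
            )

// "hello" and "hey" both fail after reading "h", but the input is rewound
// each time, so the final branch now gets its chance and the parse succeeds.
val result = fixed.parse("hi")
```

The last branch needs no atomic because there is nothing after it to fall back to.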
Breakpoints
The Breakpoint type has values EntryBreak, ExitBreak, FullBreak, and NoBreak, which is the
default. EntryBreak will pause execution on entry to the combinator, requiring input on the
console to proceed, and ExitBreak will do the same on exit. FullBreak has the effect of both
EntryBreak and ExitBreak combined.
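As a rough sketch, a breakpoint might be attached as shown below. This assumes the Breakpoint value is passed as the argument immediately after the name, which this page does not itself demonstrate, so the Scaladoc should be checked for the exact signature. The name paused is illustrative.

```scala
import parsley.character.string
import parsley.debug._

// Assumption: the breakpoint is the second argument to debug.
// FullBreak should pause on both entry to and exit from the annotated
// parser, waiting for console input each time before continuing.
val paused = string("hey").debug("hey", FullBreak)
```

The parser is only constructed here, not run: parsing with it would block until input is supplied on the console.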
Watching References
The debug combinator takes a variadic number of reference/name pairs as its last argument. These
allow you to watch the values stored in references during the debugging process. For instance:
import parsley.Parsley.atomic
import parsley.state._
import parsley.character.string
import parsley.debug._
val p = 0.makeRef { r1 =>
false.makeRef { r2 =>
val p = ( string("hello")
~> r1.update(_ + 5)
~> ( string("!")
| r2.set(true) ~> string("?")
).debug("punctuation", r2 -> "r2")
)
r1.rollback(atomic(p).debug("hello!", r1 -> "r1", r2 -> "r2"))
.debug("rollback", r1 -> "r1")
}
}
// p: parsley.Parsley[String] = parsley.Parsley@76e47a91
p.parse("hello world")
// >rollback> (1, 1): hello·
// ^
// watched registers:
// r1 = 0
//
// >hello!> (1, 1): hello·
// ^
// watched registers:
// r1 = 0
// r2 = false
//
// >punctuation> (1, 6): hello·world•
// ^
// watched registers:
// r2 = false
//
// <punctuation< (1, 6): hello·world• Fail
// ^
// watched registers:
// r2 = true
//
// <hello!< (1, 1): hello· Fail
// ^
// watched registers:
// r1 = 5
// r2 = true
//
// <rollback< (1, 1): hello· Fail
// ^
// watched registers:
// r1 = 0
//
// res3: parsley.Result[String, String] = Failure((line 1, column 6):
// unexpected space
// expected "!" or "?"
// >hello world
// ^)
Debugging Error Messages (debugError)
The debugError combinator is slightly more experimental and aims to provide some (lower-level)
insight into how an error message came to be. For instance:
import parsley.character.{letter, digit, char}
import parsley.Parsley.many
import parsley.debug._
val q = (many( ( digit.debugError("digit")
| letter.debugError("letter")
).debugError("letter or digit")
).debugError("many letterOrDigit")
~> many(char('@').debugError("@")).debugError("many @")
~> char('#').debugError("#")) | char('!').debugError("!")
// q: parsley.Parsley[Char] = parsley.Parsley@8ef90f1
q.parse("$")
// >many letterOrDigit> (offset 0, line 1, col 1): current hints are Set() (valid at offset 0)
// >letter or digit> (offset 0, line 1, col 1): current hints are Set() (valid at offset 0)
// >digit> (offset 0, line 1, col 1): current hints are Set() (valid at offset 0)
// <digit< (offset 0, line 1, col 1): Fail
// generated vanilla error (offset 0, line 1, col 1) {
// unexpected item = "$"
// expected item(s) = Set(digit)
// reasons = no reasons given
// }
// >letter> (offset 0, line 1, col 1): current hints are Set() (valid at offset 0)
// <letter< (offset 0, line 1, col 1): Fail
// generated vanilla error (offset 0, line 1, col 1) {
// unexpected item = "$"
// expected item(s) = Set(letter)
// reasons = no reasons given
// }
// <letter or digit< (offset 0, line 1, col 1): Fail
// generated vanilla error (offset 0, line 1, col 1) {
// unexpected item = "$"
// expected item(s) = Set(letter, digit)
// reasons = no reasons given
// }
// <many letterOrDigit< (offset 0, line 1, col 1): Good, current hints are Set(letter, digit) with all added since entry to debug (valid at offset 0)
// >many @> (offset 0, line 1, col 1): current hints are Set(letter, digit) (valid at offset 0)
// >@> (offset 0, line 1, col 1): current hints are Set(letter, digit) (valid at offset 0)
// <@< (offset 0, line 1, col 1): Fail
// generated vanilla error (offset 0, line 1, col 1) {
// unexpected item = "$"
// expected item(s) = Set("@", letter, digit)
// reasons = no reasons given
// }
// <many @< (offset 0, line 1, col 1): Good, current hints are Set("@", letter, digit) with Set("@") added since entry to debug (valid at offset 0)
// >#> (offset 0, line 1, col 1): current hints are Set("@", letter, digit) (valid at offset 0)
// <#< (offset 0, line 1, col 1): Fail
// generated vanilla error (offset 0, line 1, col 1) {
// unexpected item = "$"
// expected item(s) = Set("@", letter, digit, "#")
// reasons = no reasons given
// }
// >!> (offset 0, line 1, col 1): current hints are Set() (valid at offset 0)
// <!< (offset 0, line 1, col 1): Fail
// generated vanilla error (offset 0, line 1, col 1) {
// unexpected item = "$"
// expected item(s) = Set("!")
// reasons = no reasons given
// }
// res4: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "$"
// expected "!", "#", "@", digit, or letter
// >$
// ^)
In the above example, you can see how each individual error is raised, as well as evidence of merging,
and how errors can be turned into "hints" if the error is successfully recovered from: this means
that their labels may be re-incorporated into a later error if it occurs at the valid offset,
as seen in the errors for char('#').
Profiling Parsers (Profiler)
The Profiler class, and the accompanying profile combinator, provide a rough guideline of how
much of the runtime a parser might be taking up. The execution of each combinator is measured with
a resolution of 100ns. First, a Profiler object must be set up and implicitly available in scope:
its role is to collect the profiling samples. Then, a parser annotated with profile combinators
is run, and the results can be displayed with profiler.summary().
Be aware that the profiler "just provides data": no guarantee about its statistical
significance is given. Multiple runs can be performed and these will be aggregated; the profiler
can be cleared using clear().
import parsley.Parsley, Parsley.{many, pure}
import parsley.character.{string, char}
import parsley.combinator.traverse
import parsley.debug._
def classicString(s: String): Parsley[String] = s.toList match {
case Nil => pure("")
case c :: cs => traverse(c, cs: _*)(char).map(_.mkString)
}
implicit val profiler: Profiler = new Profiler
// profiler: Profiler = ...
val strings = many(classicString("...").profile("classic string")
<~> string("!!!").profile("optimised string"))
// strings: Parsley[List[(String, String)]] = ...
val stringsVoid = many(classicString("...").profile("voided classic string")
<~> string("!!!").profile("voided optimised string")).void
// stringsVoid: Parsley[Unit] = ...
strings.parse("...!!!" * 10000)
// res5: parsley.Result[String, List[(String, String)]] = ...
stringsVoid.parse("...!!!" * 10000)
// res6: parsley.Result[String, Unit] = ...
profiler.summary()
// name                       self time  num calls  average self time
// ------------------------------------------------------------------
// classic string           3017092.3μs      10001          301.679μs
// voided optimised string      888.9μs      10000            0.088μs
// optimised string            9626.0μs      10000            0.962μs
// voided classic string       3603.4μs      10001            0.360μs
// ------------------------------------------------------------------
The above example shows that the string combinator is much faster than the "classic" definition
in terms of traverse and char (not even accounting for its improved error messages!). However,
it also shows that when the results are not required (as indicated by the void combinator, which
aggressively suppresses result generation underneath it), the combinators perform much more similarly.
You will, however, notice variance depending on when you visit this page: these results are regenerated
each time the page is published, and sometimes the non-voided string can outperform the voided one by
pure chance!
The profile combinator accounts for other profiled combinators underneath it, so that each one
reports its own "self time" only. This helps to measure the impact of a specific sub-parser more accurately.
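To illustrate self time with nested profile combinators, here is a small sketch under stated assumptions. The parser names word and entry and the input "abc123" are invented for illustration, and no timings are shown because they vary from run to run.

```scala
import parsley.Parsley.many
import parsley.character.{digit, letter}
import parsley.debug._

// The profiler must be implicitly in scope wherever profile is used.
implicit val profiler: Profiler = new Profiler

// "word" is profiled on its own and is also nested inside "entry".
val word  = many(letter).profile("word")
val entry = (word <~> many(digit)).profile("entry")

val parsed = entry.parse("abc123")

// Time spent inside "word" is reported against "word" only; the row for
// "entry" covers the remainder (here, the digits and the pairing).
profiler.summary()

// Discard the collected samples before starting an unrelated measurement.
profiler.clear()
```

Without the inner profile, the time spent in word would simply be counted as part of entry.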