Error Message Combinators
Aside from the failures generated by character consumption, parsley has
many combinators both for generating failures unconditionally and for
augmenting existing errors with more information. These are found within
the parsley.errors.combinator module.
The Scaladoc for this page can be found at parsley.errors.combinator.
Failure Combinators
Normally, failures can be generated by empty, satisfy, string, and
notFollowedBy, as well as their derivatives. However, those do not capture the
full variety of "unexpected" parts of error messages. The named items are
produced by unexpected combinators, and wider carets of empty items can be
obtained by passing wider values to empty. This is summarised in the table
below, where empty corresponds to empty(0) (both are found in parsley.Parsley).
Caret | empty | raw/eof | named |
---|---|---|---|
0 | empty(0) | n/a | unexpected(0, _) |
1 | empty(1) | satisfy | unexpected(1, _) |
n | empty(n) | string | unexpected(n, _) |
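As a rough illustration of the caret column, the sketch below builds one failing parser for several cells of the table, using only empty and unexpected as described above. The names zeroWide, twoWide, and named, and the item name "keyword", are placeholders chosen for this example.

```scala
import parsley.Parsley.empty
import parsley.errors.combinator.unexpected

// All three parsers fail immediately without reading any input;
// they differ in the requested caret width and in whether the
// unexpected item is named.
val zeroWide = empty(0)                   // empty item, caret width 0
val twoWide  = empty(2)                   // empty item, caret width 2
val named    = unexpected(2, "keyword")   // named item, caret width 2

// Printing the results shows how each error is rendered.
println(zeroWide.parse("abcd"))
println(twoWide.parse("abcd"))
println(named.parse("abcd"))
```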
The unexpected Combinator
The unexpected combinator fails immediately, but produces a given name as
the unexpected component of the error message with a caret as wide as the
given integer. For instance:
import parsley.character.char
import parsley.errors.combinator.unexpected
unexpected(3, "foo").parse("abcd")
// res0: parsley.Result[String, Nothing] = Failure((line 1, column 1):
// unexpected foo
// >abcd
// ^^^)
(char('a') | unexpected("not an a")).parse("baa")
// res1: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected not an a
// expected "a"
// >baa
// ^)
There are a few things to note about the above examples:
- Just using unexpected alone does not introduce any other components, like expected items, to the error
- When the caret width is unspecified, it will adapt to whatever the caret would have been for the error message
- The named items resulting from the combinator dominate other kinds of
item, so that char('a')'s natural unexpected "b" disappears
The fail Combinator
In contrast to the unexpected combinator, which produces vanilla errors, the
fail combinator produces specialised errors, which suppress all other
components of an error in favour of some specific messages.
import parsley.character.string
import parsley.errors.combinator.fail
fail(2, "msg1", "msg2", "msg3").parse("abc")
// res2: parsley.Result[String, Nothing] = Failure((line 1, column 1):
// msg1
// msg2
// msg3
// >abc
// ^^)
(fail(1, "msg1") | fail(2, "msg2") | fail("msg3")).parse("abc")
// res3: parsley.Result[String, Nothing] = Failure((line 1, column 1):
// msg1
// msg2
// msg3
// >abc
// ^^)
(fail("msg") | string("abc")).parse("xyz")
// res4: parsley.Result[String, String] = Failure((line 1, column 1):
// msg
// >xyz
// ^^^)
(fail(1, "msg") | string("abc")).parse("xyz")
// res5: parsley.Result[String, String] = Failure((line 1, column 1):
// msg
// >xyz
// ^)
Notice that if a caret width is specified, it will override any other
carets from other combinators, like string. When no caret width is specified,
the caret adapts to those of the other combinators. The fail combinator also
suppresses other error messages, and merges within itself as if all the
messages were generated by one fail.
Error Enrichment
Other than the freestanding combinators, some combinators are enabled
by importing parsley.errors.combinator.ErrorMethods. Some of these
augment error messages with additional information, and are discussed below.
None of the combinators in this section have any effect on fail or its
derivatives.
The label Combinator
When combinators that read characters fail, they produce "expected" components in error messages:
import parsley.character.{char, string, satisfy}
char('a').parse("b")
// res6: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "b"
// expected "a"
// >b
// ^)
string("abc").parse("xyz")
// res7: parsley.Result[String, String] = Failure((line 1, column 1):
// unexpected "xyz"
// expected "abc"
// >xyz
// ^^^)
satisfy(_.isDigit).parse("a")
// res8: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "a"
// >a
// ^)
Notice that the satisfy combinator cannot produce an expected item because
nothing is known about the function passed in. The other two produce raw
expected items. The label combinator can be used to replace these and generate
named items. This is employed by parsley.character for its more specific
parsers:
import parsley.errors.combinator.ErrorMethods
val digit = satisfy(_.isDigit).label("digit")
// digit: parsley.Parsley[Char] = parsley.Parsley@128f792a
digit.parse("a")
// res9: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "a"
// expected digit
// >a
// ^)
The label combinator above has added the label digit to the parser. If
there was an existing label there, it would have been replaced.
A label combinator cannot be provided with "". In other libraries, this may
represent hiding; in parsley, however, the hide combinator is distinct.
A label combinator, along with other combinators, only applies if the
error message properly lines up with the point the input was at when it
entered the combinator; otherwise, the label may be inaccurate. For example:
val twoDigits = (digit *> digit).label("two digits")
// twoDigits: parsley.Parsley[Char] = parsley.Parsley@257c37c1
twoDigits.parse("a")
// res10: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "a"
// expected two digits
// >a
// ^)
twoDigits.parse("1a")
// res11: parsley.Result[String, Char] = Failure((line 1, column 2):
// unexpected "a"
// expected digit
// >1a
// ^)
The explain Combinator
The explain combinator allows for the addition of further lines of error
message, providing more high-level reasons for the error or explanations about
a syntactic construct. It behaves similarly to label in that it will only
apply when the position of the error message matches the offset that the combinator entered at.
import parsley.errors.combinator.ErrorMethods
digit.explain("a digit is needed, for some reason").parse("a")
// res12: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "a"
// expected digit
// a digit is needed, for some reason
// >a
// ^)
An explain combinator cannot be provided with "".
The hide Combinator
Sometimes, a parser should not appear in an error message. A good example is
whitespace, which is almost never the solution to any parsing problem, and
would otherwise distract from the rest of the error content. The hide combinator
can be used to suppress a parser from appearing in the rest of a message:
import parsley.errors.combinator.ErrorMethods
(char('a') | digit.hide).parse("b")
// res13: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "b"
// expected "a"
// >b
// ^)
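The whitespace case mentioned above might look like the following sketch. It assumes whitespaces from parsley.character as the whitespace-skipping parser; the names noisy and quiet are placeholders for this example.

```scala
import parsley.character.{char, whitespaces}
import parsley.errors.combinator.ErrorMethods

// Skips optional whitespace, then expects an 'a'.
// Errors from this version are free to mention whitespace.
val noisy = whitespaces *> char('a')

// Same parser, but the whitespace parser is hidden, so it should not
// contribute to the error message.
val quiet = whitespaces.hide *> char('a')

// Printing both results allows the two error messages to be compared.
println(noisy.parse("b"))
println(quiet.parse("b"))
```

Hiding only changes what is reported on failure: both parsers accept the same inputs.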
Error Adjustment Combinators
The previous combinators on this page have been geared towards adding richer information to parse errors. The combinators in this section instead adjust the existing information, mostly relating to position, to ensure the error remains specific.
The amend Combinator
The amend combinator can adjust the position of an error message so that it
occurs at an earlier position. This means that the error can then be affected
by other combinators like label and explain. This is a precision tool, designed
for fine-tuning error messages.
import parsley.errors.combinator.amend
amend(digit *> char('a')).parse("9b")
// res14: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "9"
// expected "a"
// >9b
// ^)
Notice that the above error makes no sense. This is why amend is a precision
tool: it should ideally be used in conjunction with other combinators. For instance:
import parsley.syntax.character.charLift
import parsley.combinator.choice
import parsley.character.{noneOf, stringOfMany}
val escapeChar = choice('n'.as('\n'), 't'.as('\t'), '\"', '\\')
val strLetter =
noneOf('\"', '\\').label("string char") | ('\\' ~> escapeChar).label("escape char")
val strLit = '\"' ~> stringOfMany(strLetter) <~ '\"'
strLit.parse("\"\\b\"")
// res15: parsley.Result[String, String] = Failure((line 1, column 3):
// unexpected "b"
// expected """, "\", "n", or "t"
// >"\b"
// ^)
In the above error, it is not entirely clear why the presented characters
are expected. Perhaps it would be better to highlight a correct escape
character instead? The amend combinator can be used in this case to pull
the error back and rectify it:
val strLetter = noneOf('\"', '\\').label("string char") |
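amend('\\' ~> escapeChar).label("escape char")
// strLit is rebuilt so that it picks up the new strLetter
val strLit = '\"' ~> stringOfMany(strLetter) <~ '\"'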
strLit.parse("\"\\b\"")
// res16: parsley.Result[String, String] = Failure((line 1, column 2):
// unexpected "\"
// expected escape char or string char
// >"\b"
// ^)
While the amend has pulled the error back, and thanks to the label the
error is still sensible, it could be improved by widening the caret and
providing an explanation:
import parsley.Parsley.empty
val escapeChar = choice('n'.as('\n'), 't'.as('\t'), '\"', '\\') | empty(2)
val strLetter = noneOf('\"', '\\').label("string char") |
amend('\\' ~> escapeChar)
.label("escape char")
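.explain("escape characters are \\n, \\t, \\\", or \\\\")
// strLit is rebuilt so that it picks up the new strLetter
val strLit = '\"' ~> stringOfMany(strLetter) <~ '\"'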
strLit.parse("\"\\b\"")
// res17: parsley.Result[String, String] = Failure((line 1, column 2):
// unexpected "\b"
// expected escape char or string char
// escape characters are \n, \t, \", or \\
// >"\b"
// ^^)
Note that an unexpected could also have been used instead of empty to good effect.
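As a sketch of that alternative, the string literal parser can be written with unexpected(2, _) in place of empty(2), so that the error carries a named unexpected item with the same caret width. The item name "invalid escape sequence" is a hypothetical choice for illustration; the rest mirrors the definitions used earlier in this section.

```scala
import parsley.character.{noneOf, stringOfMany}
import parsley.combinator.choice
import parsley.errors.combinator.{amend, unexpected, ErrorMethods}
import parsley.syntax.character.charLift

// If no known escape character matches, fail with a named unexpected item
// whose caret is two characters wide (the backslash plus the bad character).
val escapeChar =
  choice('n'.as('\n'), 't'.as('\t'), '\"', '\\') | unexpected(2, "invalid escape sequence")

// amend pulls the error back to the backslash, where label and explain apply.
val strLetter = noneOf('\"', '\\').label("string char") |
  amend('\\' ~> escapeChar)
    .label("escape char")
    .explain("escape characters are \\n, \\t, \\\", or \\\\")

val strLit = '\"' ~> stringOfMany(strLetter) <~ '\"'

// The input is the three-character literal "\b", which is not a valid escape.
println(strLit.parse("\"\\b\""))
```

Whether a named item reads better than the raw input here is a matter of taste; the named form is mostly useful when the raw characters alone would not tell the user what went wrong.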
The entrench and dislodge Combinators
The amend combinator will indiscriminately adjust error messages
so that they occur earlier. However, sometimes only errors from some
parts of a parser should be repositioned. The entrench combinator
protects errors from within its scope from being amended, and
dislodge undoes that protection.
This can be useful if you want an error to be able to dominate another one, and then be amended afterwards, without affecting the original error. This normally has the following pattern:
val p = amendThenDislodge(1) {
entrench(q) | r
}
In this example, we believe that r will produce errors deeper than q's, but
once r's error has displaced q's message, it should be reset to an earlier
point. On the other hand, q is protected from the initial amendment, but is
then free to be amended again after the dislodge has removed the protection.
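To make the protection concrete, the sketch below reuses the digit *> char('a') parser from the amend example and compares it with and without entrench. It assumes the single-argument forms of amend, entrench, and dislodge from parsley.errors.combinator, and uses digit from parsley.character; the names amended, guarded, and released are placeholders for this example.

```scala
import parsley.character.{char, digit}
import parsley.errors.combinator.{amend, dislodge, entrench}

// As in the amend example: the error is pulled back to where the parser started.
val amended = amend(digit *> char('a'))

// entrench shields the inner errors, so the surrounding amend should leave
// their position untouched.
val guarded = amend(entrench(digit *> char('a')))

// dislodge removes the protection again, so the outer amend should be able
// to reposition the error once more.
val released = amend(dislodge(guarded))

// Printing the three results allows the reported positions to be compared.
println(amended.parse("9b"))
println(guarded.parse("9b"))
println(released.parse("9b"))
```

None of these combinators change which inputs are accepted; they only influence where a failure is reported.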
The markAsToken Combinator
The markAsToken combinator will assign the "lexical" property to any error messages that happen within its scope at a deeper position than the combinator
began at. This is fed forward onto the unexpectedToken method of the ErrorBuilder: more about this in lexical extraction.