Error Message Combinators
Aside from the failures generated by character consumption, parsley has
many combinators for generating failures unconditionally, as well as for
augmenting existing errors with more information. These are found within
the parsley.errors.combinator module.
The Scaladoc for this page can be found at parsley.errors.combinator.
Failure Combinators
Normally, failures can be generated by empty, satisfy, string, and
notFollowedBy, as well as their derivatives. However, those do not capture the
full variety of "unexpected" parts of error messages. In the table below, empty
corresponds to empty(0) (both are found in parsley.Parsley). The
named items are produced by the unexpected combinators, and wider carets of
empty items can be obtained by passing wider values to empty.
| Caret | empty | raw/eof | named |
|---|---|---|---|
| 0 | empty(0) | n/a | unexpected(0, _) |
| 1 | empty(1) | satisfy | unexpected(1, _) |
| n | empty(n) | string | unexpected(n, _) |
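As a small illustration of the empty column, the following sketch builds empty failures with different caret widths. It assumes a parsley version in which empty accepts a caret width, as used elsewhere on this page; the names zeroWidth and threeWide are purely illustrative.

```scala
import parsley.Parsley.empty

// empty(0): a failure with no unexpected item and a caret width of 0
val zeroWidth = empty(0)
// empty(3): still no raw or named item, but the caret width is 3
val threeWide = empty(3)

// both parsers fail without consuming any input
val r0 = zeroWidth.parse("abcd")
val r3 = threeWide.parse("abcd")
```

Printing r0 and r3 should show how the caret differs between the two, while neither error carries an unexpected item.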
The unexpected Combinator
The unexpected combinator fails immediately, but produces a given name as
the unexpected component of the error message with a caret as wide as the
given integer. For instance:
import parsley.character.char
import parsley.errors.combinator.unexpected
unexpected(3, "foo").parse("abcd")
// res0: parsley.Result[String, Nothing] = Failure((line 1, column 1):
// unexpected foo
// >abcd
// ^^^)
(char('a') | unexpected("not an a")).parse("baa")
// res1: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected not an a
// expected "a"
// >baa
// ^)
There are a few things to note about the above examples:
- Just using unexpected alone does not introduce any other components, like expected items, to the error
- When the caret width is unspecified, it will adapt to whatever the caret would have been for the error message
- The named items resulting from the combinator dominate other kinds of item, so that char('a')'s natural unexpected "b" disappears
The fail Combinator
In contrast to the unexpected combinator, which produces vanilla errors, the
fail combinator produces specialised errors, which suppress all other
components of an error in favour of some specific messages.
import parsley.character.string
import parsley.errors.combinator.fail
fail(2, "msg1", "msg2", "msg3").parse("abc")
// res2: parsley.Result[String, Nothing] = Failure((line 1, column 1):
// msg1
// msg2
// msg3
// >abc
// ^^)
(fail(1, "msg1") | fail(2, "msg2") | fail("msg3")).parse("abc")
// res3: parsley.Result[String, Nothing] = Failure((line 1, column 1):
// msg1
// msg2
// msg3
// >abc
// ^^)
(fail("msg") | string("abc")).parse("xyz")
// res4: parsley.Result[String, String] = Failure((line 1, column 1):
// msg
// >xyz
// ^^^)
(fail(1, "msg") | string("abc")).parse("xyz")
// res5: parsley.Result[String, String] = Failure((line 1, column 1):
// msg
// >xyz
// ^)
Notice that if a caret width is specified, it will override any other
carets from other combinators, like string. When no caret width is
specified, the caret is adaptive. The fail combinator also suppresses
other error messages, and merges with other fail combinators as if all
the messages were generated by one fail.
Error Enrichment
Other than the freestanding combinators, some combinators are enabled
by importing parsley.errors.combinator.ErrorMethods. Some of these
are involved with augmenting error messages with additional information.
These are discussed below.
None of the combinators in this section have any effect on fail or its
derivatives.
The label Combinator
When combinators that read characters fail, they produce "expected" components in error messages:
import parsley.character.{char, string, satisfy}
char('a').parse("b")
// res6: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "b"
// expected "a"
// >b
// ^)
string("abc").parse("xyz")
// res7: parsley.Result[String, String] = Failure((line 1, column 1):
// unexpected "xyz"
// expected "abc"
// >xyz
// ^^^)
satisfy(_.isDigit).parse("a")
// res8: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "a"
// >a
// ^)
Notice that the satisfy combinator cannot produce an expected item because
nothing is known about the function passed in. The other two produce raw
expected items. The label combinator can be used to replace these and generate
named items. This is employed by parsley.character for its more specific
parsers:
import parsley.errors.combinator.ErrorMethods
val digit = satisfy(_.isDigit).label("digit")
// digit: parsley.Parsley[Char] = parsley.Parsley@6ec41c4
digit.parse("a")
// res9: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "a"
// expected digit
// >a
// ^)
The label combinator above has added the label digit to the parser. If
there was an existing label there, it would have been replaced.
A label combinator cannot be provided with "". In other libraries, this may
represent hiding; in parsley, however, the hide combinator is distinct.
A label combinator, along with other combinators, only applies if the
error message properly lines up with the point the input was at when it
entered the combinator - otherwise, the label may be inaccurate. For example:
val twoDigits = (digit *> digit).label("two digits")
// twoDigits: parsley.Parsley[Char] = parsley.Parsley@2295d73c
twoDigits.parse("a")
// res10: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "a"
// expected two digits
// >a
// ^)
twoDigits.parse("1a")
// res11: parsley.Result[String, Char] = Failure((line 1, column 2):
// unexpected "a"
// expected digit
// >1a
// ^)
The explain Combinator
The explain combinator allows for the addition of further lines of error
message, providing more high-level reasons for the error or explanations about
a syntactic construct. It behaves similarly to label in that it will only
apply when the position of the error message matches the offset that the combinator entered at.
import parsley.errors.combinator.ErrorMethods
digit.explain("a digit is needed, for some reason").parse("a")
// res12: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "a"
// expected digit
// a digit is needed, for some reason
// >a
// ^)
An explain combinator cannot be provided with "".
The hide Combinator
Sometimes, a parser should not appear in an error message. A good example is
whitespace, which is almost never the solution to any parsing problem, and
would otherwise distract from the rest of the error content. The hide combinator
can be used to suppress a parser from appearing in the rest of a message:
import parsley.errors.combinator.ErrorMethods
(char('a') | digit.hide).parse("b")
// res13: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "b"
// expected "a"
// >b
// ^)
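The whitespace case mentioned above could be sketched as follows. This is only an illustration: skipWs and p are hypothetical names, and it assumes whitespaces from parsley.character is available in the parsley version in use.

```scala
import parsley.character.{char, whitespaces}
import parsley.errors.combinator.ErrorMethods

// whitespace skipping that should never be suggested as an expected item
val skipWs = whitespaces.hide
// skip any leading whitespace, then require an 'a'
val p = skipWs *> char('a')

val ok = p.parse("  a")
val bad = p.parse("  b")
```

The intent is that the error for bad mentions the expected "a" without offering more whitespace as an alternative.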
Error Adjustment Combinators
The previous combinators on this page have been geared towards adding richer information to parse errors. The combinators in this section instead adjust the existing information, mostly relating to position, to ensure the error remains specific.
The amend Combinator
The amend combinator can adjust the position of an error message so that it
occurs at an earlier position. This means that the amended error can then be
affected by combinators like label and explain, which only apply to errors
positioned where those combinators were entered. This is a precision tool,
designed for fine-tuning error messages.
import parsley.errors.combinator.amend
amend(digit *> char('a')).parse("9b")
// res14: parsley.Result[String, Char] = Failure((line 1, column 1):
// unexpected "9"
// expected "a"
// >9b
// ^)
Notice that the above error makes no sense. This is why amend is a precision
tool: it should ideally be used in conjunction with other combinators. For instance:
import parsley.syntax.character.charLift
import parsley.combinator.choice
import parsley.character.{noneOf, stringOfMany}
val escapeChar = choice('n'.as('\n'), 't'.as('\t'), '\"', '\\')
val strLetter =
noneOf('\"', '\\').label("string char") | ('\\' ~> escapeChar).label("escape char")
val strLit = '\"' ~> stringOfMany(strLetter) <~ '\"'
strLit.parse("\"\\b\"")
// res15: parsley.Result[String, String] = Failure((line 1, column 3):
// unexpected "b"
// expected """, "\", "n", or "t"
// >"\b"
// ^)
In the above error, it is not entirely clear why the presented characters
are expected. Perhaps it would be better to highlight a correct escape
character instead? The amend combinator can be used in this case to pull
the error back and rectify it:
val strLetter = noneOf('\"', '\\').label("string char") |
amend('\\' ~> escapeChar).label("escape char")
strLit.parse("\"\\b\"")
// res16: parsley.Result[String, String] = Failure((line 1, column 2):
// unexpected "\"
// expected escape char or string char
// >"\b"
// ^)
While the amend has pulled the error back, and thanks to the label the
error is still sensible, it could be improved by widening the caret and
providing an explanation:
import parsley.Parsley.empty
val escapeChar = choice('n'.as('\n'), 't'.as('\t'), '\"', '\\') | empty(2)
val strLetter = noneOf('\"', '\\').label("string char") |
amend('\\' ~> escapeChar)
.label("escape char")
.explain("escape characters are \\n, \\t, \\\", or \\\\")
strLit.parse("\"\\b\"")
// res17: parsley.Result[String, String] = Failure((line 1, column 2):
// unexpected "\b"
// expected escape char or string char
// escape characters are \n, \t, \", or \\
// >"\b"
// ^^)
Note, an unexpected could also have been used instead of empty to good effect.
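A sketch of that alternative is shown below, swapping empty(2) for unexpected with the same caret width. The item name "invalid escape sequence" is an arbitrary choice made for illustration, and the characters are written with explicit char calls rather than the implicit lifting used above.

```scala
import parsley.character.{char, noneOf, stringOfMany}
import parsley.combinator.choice
import parsley.errors.combinator.{amend, unexpected, ErrorMethods}

// fall back to a named unexpected item, two characters wide, when no escape matches
val escapeChar =
  choice(char('n').as('\n'), char('t').as('\t'), char('"'), char('\\')) |
    unexpected(2, "invalid escape sequence")

val strLetter = noneOf('"', '\\').label("string char") |
  amend(char('\\') ~> escapeChar)
    .label("escape char")
    .explain("escape characters are \\n, \\t, \\\", or \\\\")

val strLit = char('"') ~> stringOfMany(strLetter) <~ char('"')

val good = strLit.parse("\"a\\n\"") // a well-formed literal containing \n
val bad = strLit.parse("\"\\b\"")   // \b is not a known escape
```

Compared with empty(2), the error for bad would be expected to name the chosen item in its unexpected component rather than showing the raw input.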
The entrench and dislodge Combinators
The amend combinator will indiscriminately adjust error messages
so that they occur earlier. However, sometimes only errors from some
parts of a parser should be repositioned. The entrench combinator
protects errors from within its scope from being amended, and
dislodge undoes that protection.
This can be useful if you want an error to be able to dominate another one, and then be amended afterwards, without affecting the original error. This normally has the following pattern:
val p = amendThenDislodge(1) {
entrench(q) | r
}
In this example, we believe that r will produce errors deeper than q's, but after its error
discards q's message, it should be reset to an earlier point. On the other hand, q is protected
from the initial amendment, but is then free to be amended again after the dislodge has removed
the protection.
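To make the pattern concrete, here is a minimal sketch with stand-in parsers. Both q and r are hypothetical and chosen only so that r can fail one character deeper than q; they are not taken from a real grammar.

```scala
import parsley.Parsley
import parsley.character.{char, digit}
import parsley.errors.combinator.{amendThenDislodge, entrench}

// hypothetical q: fails without consuming input when the next character is not a digit
val q: Parsley[Char] = digit
// hypothetical r: may consume 'x' before failing, so its error can be one character deeper
val r: Parsley[Char] = char('x') *> char('y')

// q's errors are entrenched against the amend; the dislodge then lifts that protection
val p: Parsley[Char] = amendThenDislodge(1) {
  entrench(q) | r
}

val ok1 = p.parse("7")
val ok2 = p.parse("xy")
val bad = p.parse("xz") // r fails after consuming 'x'
```

Printing bad and comparing it with the result of the same parser without amendThenDislodge and entrench is a reasonable way to see the effect on the reported position.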
The markAsToken Combinator
The markAsToken combinator will assign the "lexical" property to any error messages that happen within its scope at a deeper position than the combinator
began at. This is fed forward onto the unexpectedToken method of the ErrorBuilder: more about this in lexical extraction.
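As a rough sketch, a token-like parser could be wrapped as follows. The hexLit parser is a hypothetical example, chosen so that a failure can happen after some of the token has already been read.

```scala
import parsley.character.{char, hexDigit}
import parsley.errors.combinator.markAsToken

// hypothetical token: "0x" followed by a single hexadecimal digit
val hexLit = markAsToken(char('0') *> char('x') *> hexDigit)

val ok = hexLit.parse("0xf")
// fails at the third character, which is deeper than where the combinator began
val bad = hexLit.parse("0xg")
```

With the default ErrorBuilder the difference may not be visible in the printed message; the property matters once unexpectedToken is customised.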