Error Message Combinators

Aside from the failures generated by character consumption, parsley has many combinators both for generating failures unconditionally and for augmenting existing errors with more information. These are found within the parsley.errors.combinator module.

The Scaladoc for this page can be found at parsley.errors.combinator.

Failure Combinators

Normally, failures can be generated by empty, satisfy, string, and notFollowedBy, as well as their derivatives. However, those do not capture the full variety of "unexpected" parts of error messages. The table below summarises the options by caret width: empty is equivalent to empty(0) (both are found in parsley.Parsley), wider carets for empty items can be obtained by passing wider values to empty, and named items are produced by the unexpected combinators.

Caret | empty    | raw/eof | named
0     | empty(0) | n/a     | unexpected(0, _)
1     | empty(1) | satisfy | unexpected(1, _)
n     | empty(n) | string  | unexpected(n, _)
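
As a small illustration of the empty column, the sketch below builds empty failures with different caret widths. The value names are hypothetical, and it assumes a parsley version in which empty accepts a caret width, as described above.

```scala
import parsley.Parsley.empty

// empty(n) fails without consuming input; n only controls how wide the caret is drawn
val zeroWide = empty(0).parse("abc")
val oneWide = empty(1).parse("abc")
val threeWide = empty(3).parse("abc")
```

All three results are failures at the start of the input; only the rendering of the caret should differ between them.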

The unexpected Combinator

The unexpected combinator fails immediately, but produces a given name as the unexpected component of the error message with a caret as wide as the given integer. For instance:

import parsley.character.char
import parsley.errors.combinator.unexpected

unexpected(3, "foo").parse("abcd")
// res0: parsley.Result[String, Nothing] = Failure((line 1, column 1):
//   unexpected foo
//   >abcd
//    ^^^)
(char('a') | unexpected("not an a")).parse("baa")
// res1: parsley.Result[String, Char] = Failure((line 1, column 1):
//   unexpected not an a
//   expected "a"
//   >baa
//    ^)

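There are a few things to note about the above examples: the given name is printed without quotes, unlike a raw item such as "a"; the caret in the first example is three characters wide, as requested; and in the second example, where no width is given, the caret is one character wide.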

The fail Combinator

In contrast to the unexpected combinator, which produces vanilla errors, the fail combinator produces specialised errors, which suppress all other components of an error in favour of some specific messages.

import parsley.character.string
import parsley.errors.combinator.fail

fail(2, "msg1", "msg2", "msg3").parse("abc")
// res2: parsley.Result[String, Nothing] = Failure((line 1, column 1):
//   msg1
//   msg2
//   msg3
//   >abc
//    ^^)
(fail(1, "msg1") | fail(2, "msg2") | fail("msg3")).parse("abc")
// res3: parsley.Result[String, Nothing] = Failure((line 1, column 1):
//   msg1
//   msg2
//   msg3
//   >abc
//    ^^)
(fail("msg") | string("abc")).parse("xyz")
// res4: parsley.Result[String, String] = Failure((line 1, column 1):
//   msg
//   >xyz
//    ^^^)
(fail(1, "msg") | string("abc")).parse("xyz")
// res5: parsley.Result[String, String] = Failure((line 1, column 1):
//   msg
//   >xyz
//    ^)

Notice that if a caret width is specified, it overrides any carets from other combinators, like string, whereas leaving the width unspecified lets the caret adapt. The fail combinator also suppresses other error messages, and separate fails merge as if all their messages were generated by a single fail.

Error Enrichment

Other than the freestanding combinators, some combinators are enabled by importing parsley.errors.combinator.ErrorMethods. Several of these augment error messages with additional information, and are discussed below.

None of the combinators in this section have any effect on fail or its derivatives.

The label Combinator

When combinators that read characters fail, they produce "expected" components in error messages:

import parsley.character.{char, string, satisfy}

char('a').parse("b")
// res6: parsley.Result[String, Char] = Failure((line 1, column 1):
//   unexpected "b"
//   expected "a"
//   >b
//    ^)
string("abc").parse("xyz")
// res7: parsley.Result[String, String] = Failure((line 1, column 1):
//   unexpected "xyz"
//   expected "abc"
//   >xyz
//    ^^^)
satisfy(_.isDigit).parse("a")
// res8: parsley.Result[String, Char] = Failure((line 1, column 1):
//   unexpected "a"
//   >a
//    ^)

Notice that the satisfy combinator cannot produce an expected item because nothing is known about the function passed in. The other two produce raw expected items. The label combinator can be used to replace these and generate named items. This is employed by parsley.character for its more specific parsers:

import parsley.errors.combinator.ErrorMethods

val digit = satisfy(_.isDigit).label("digit")
// digit: parsley.Parsley[Char] = parsley.Parsley@128f792a
digit.parse("a")
// res9: parsley.Result[String, Char] = Failure((line 1, column 1):
//   unexpected "a"
//   expected digit
//   >a
//    ^)

The label combinator above has added the label digit to the parser. If there was an existing label there, it would have been replaced.
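
The replacement behaviour can be sketched as follows. This is only an illustration: the name number and the label "number" are hypothetical, chosen to show a second label applied on top of the first.

```scala
import parsley.character.satisfy
import parsley.errors.combinator.ErrorMethods

val digit = satisfy(_.isDigit).label("digit")
// relabelling an already labelled parser: per the description above,
// "number" should replace "digit" rather than appear alongside it
val number = digit.label("number")
val result = number.parse("a")
```

Relabelling only changes how a failure is reported; number still accepts exactly the same input as digit.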

A label combinator cannot be provided with "". In other libraries, this may represent hiding; in parsley, however, the hide combinator is distinct.

A label combinator, along with other combinators, only applies if the error occurs at the same position the input was at when it entered the combinator; otherwise, the label may be inaccurate. For example:

val twoDigits = (digit *> digit).label("two digits")
// twoDigits: parsley.Parsley[Char] = parsley.Parsley@257c37c1
twoDigits.parse("a")
// res10: parsley.Result[String, Char] = Failure((line 1, column 1):
//   unexpected "a"
//   expected two digits
//   >a
//    ^)
twoDigits.parse("1a")
// res11: parsley.Result[String, Char] = Failure((line 1, column 2):
//   unexpected "a"
//   expected digit
//   >1a
//     ^)

The explain Combinator

The explain combinator allows for the addition of further lines of error message, providing more high-level reasons for the error or explanations about a syntactic construct. It behaves similarly to label in that it will only apply when the position of the error message matches the offset that the combinator entered at.

import parsley.errors.combinator.ErrorMethods

digit.explain("a digit is needed, for some reason").parse("a")
// res12: parsley.Result[String, Char] = Failure((line 1, column 1):
//   unexpected "a"
//   expected digit
//   a digit is needed, for some reason
//   >a
//    ^)

An explain combinator cannot be provided with "".

The hide Combinator

Sometimes, a parser should not appear in an error message. A good example is whitespace, which is almost never the solution to any parsing problem, and would otherwise distract from the rest of the error content. The hide combinator can be used to suppress a parser from appearing in the rest of a message:

import parsley.errors.combinator.ErrorMethods

(char('a') | digit.hide).parse("b")
// res13: parsley.Result[String, Char] = Failure((line 1, column 1):
//   unexpected "b"
//   expected "a"
//   >b
//    ^)
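
For the whitespace case mentioned above, a minimal sketch might look like the following. It assumes parsley.character.whitespaces is available in the version in use; skipWs and p are hypothetical names.

```scala
import parsley.character.{char, whitespaces}
import parsley.errors.combinator.ErrorMethods

// whitespace is skipped as usual, but hidden so that it is not
// offered as an expected item when the parser after it fails
val skipWs = whitespaces.hide
val p = skipWs *> char('a')
val result = p.parse("  b")
```

The intent is that the failure at b reports only what p was really looking for, without also suggesting more whitespace.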

Error Adjustment Combinators

The previous combinators on this page have been geared towards adding richer information to parse errors. The combinators in this section instead adjust the existing information, mostly relating to position, to ensure the error remains specific.

The amend Combinator

The amend combinator can adjust the position of an error message so that it occurs at an earlier position. This means that the error can then be affected by other combinators like label and explain. This is a precision tool, designed for fine-tuning error messages.

import parsley.errors.combinator.amend

amend(digit *> char('a')).parse("9b")
// res14: parsley.Result[String, Char] = Failure((line 1, column 1):
//   unexpected "9"
//   expected "a"
//   >9b
//    ^)

Notice that the above error makes no sense. This is why amend is a precision tool: it should ideally be used in conjunction with other combinators. For instance:

import parsley.syntax.character.charLift
import parsley.combinator.choice
import parsley.character.{noneOf, stringOfMany}

val escapeChar = choice('n'.as('\n'), 't'.as('\t'), '\"', '\\')
val strLetter =
    noneOf('\"', '\\').label("string char") | ('\\' ~> escapeChar).label("escape char")
val strLit = '\"' ~> stringOfMany(strLetter) <~ '\"'
strLit.parse("\"\\b\"")
// res15: parsley.Result[String, String] = Failure((line 1, column 3):
//   unexpected "b"
//   expected """, "\", "n", or "t"
//   >"\b"
//      ^)

In the above error, it is not entirely clear why the presented characters are expected. Perhaps it would be better to highlight a correct escape character instead? The amend combinator can be used in this case to pull the error back and rectify it:

val strLetter = noneOf('\"', '\\').label("string char") |
                amend('\\' ~> escapeChar).label("escape char")
strLit.parse("\"\\b\"")
// res16: parsley.Result[String, String] = Failure((line 1, column 2):
//   unexpected "\"
//   expected escape char or string char
//   >"\b"
//     ^)

While the amend has pulled the error back, and thanks to the label the error is still sensible, it could be improved by widening the caret and providing an explanation:

import parsley.Parsley.empty
val escapeChar = choice('n'.as('\n'), 't'.as('\t'), '\"', '\\') | empty(2)
val strLetter = noneOf('\"', '\\').label("string char") |
                amend('\\' ~> escapeChar)
                  .label("escape char")
                  .explain("escape characters are \\n, \\t, \\\", or \\\\")
strLit.parse("\"\\b\"")
// res17: parsley.Result[String, String] = Failure((line 1, column 2):
//   unexpected "\b"
//   expected escape char or string char
//   escape characters are \n, \t, \", or \\
//   >"\b"
//     ^^)

Note, an unexpected could also have been used instead of empty to good effect.
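
As a rough sketch of that alternative, the empty(2) fallback can be swapped for an unexpected with the same caret width. The item name "invalid escape sequence" is a hypothetical choice, and the exact rendering of the resulting error is not shown here.

```scala
import parsley.character.{noneOf, stringOfMany}
import parsley.combinator.choice
import parsley.errors.combinator.{amend, unexpected, ErrorMethods}
import parsley.syntax.character.charLift

// same grammar as above, but the fallback names the offending item
// instead of leaving the unexpected component to be filled in from the input
val escapeChar =
    choice('n'.as('\n'), 't'.as('\t'), '\"', '\\') | unexpected(2, "invalid escape sequence")
val strLetter = noneOf('\"', '\\').label("string char") |
                amend('\\' ~> escapeChar)
                  .label("escape char")
                  .explain("escape characters are \\n, \\t, \\\", or \\\\")
val strLit = '\"' ~> stringOfMany(strLetter) <~ '\"'

val bad = strLit.parse("\"\\b\"")
val good = strLit.parse("\"a\\nb\"")
```

Valid literals parse exactly as before; only the unexpected component of the error for a bad escape is affected.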

The entrench and dislodge Combinators

The amend combinator will indiscriminately adjust error messages so that they occur earlier. However, sometimes only errors from some parts of a parser should be repositioned. The entrench combinator protects errors from within its scope from being amended, and dislodge undoes that protection.

This can be useful if you want an error to be able to dominate another one, and then be amended afterwards, without affecting the original error. This normally has the following pattern:

val p = amendThenDislodge(1) {
    entrench(q) | r
}

In this example, we believe that r will produce errors deeper than q's, but once r's error has displaced q's message, it should be reset to an earlier point. On the other hand, q is protected from the initial amendment, but is then free to be amended again after the dislodge has removed the protection.

The markAsToken Combinator

The markAsToken combinator will assign the "lexical" property to any error messages that happen within its scope at a deeper position than the combinator began at. This is fed forward onto the unexpectedToken method of the ErrorBuilder: more about this in lexical extraction.