Constructing Custom Errors
By default, parsley
returns errors that consist of String
-based
content. However, it is possible to build error messages into a
datatype or format that is user-defined. This is done with the
ErrorBuilder
typeclass.
The ErrorBuilder
is pulled in implicitly by the parse
method of
the Parsley
type:
class Parsley[A] {
def parse[Err: ErrorBuilder](input: String): Result[Err, A]
}
This is equivalent to having an implicit parameter of type
ErrorBuilder[Err]
. As the ErrorBuilder
companion object has an
implicit value of type ErrorBuilder[String]
only, the type
String
is chosen as the default instantiation of Err
by Scala.
Providing another ErrorBuilder
implicit object in a tighter
scope (or adding an explicit type ascription with another implicit
object available), you are able to hook in your own type instead.
This page describes how the ErrorBuilder
is structured, and gives
an example of how to construct a lossy type suitable for unit
testing generated error messages.
The Scaladoc for this page can be found at parsley.errors.ErrorBuilder
.
Error Message Structure
Error messages within parsley
take two different forms: vanilla
or specialised. The error chosen depends on the combinators used
to produce it: empty
, unexpected
, char
, string
, etc all produce vanilla errors; and fail
and its derivatives produce
specialised errors. An ErrorBuilder
must describe how to format
both kinds of error; their structure is explained below.
Vanilla Errors
┌───────────────────────────────────────────────────────────────────────┐
│ ┌────────────────┐◄──────── position │
│ source │ │ │
│ │ │ line col│ │
│ ▼ │ │ ││ │
│ ┌─────┐ │ ▼ ▼│ end of input │
│ In foo.txt (line 1, column 5): │ │
│ ┌─────────────────────┐ │ │
│unexpected ─────►│ │ │ ┌───── expected │
│ │ ┌──────────┐ ◄──────────┘ │ │
│ unexpected end of input ▼ │
│ ┌──────────────────────────────────────┐ │
│ expected "(", "negate", digit, or letter │
│ │ └──────┘ └───┘ └────┘ ◄────── named│
│ │ ▲ └──────────┘ │ │
│ │ │ │ │
│ │ raw │ │
│ └─────────────────┬───────────┘ │
│ '-' is a binary operator │ │
│ └──────────────────────┘ │ │
│ ┌──────┐ ▲ │ │
│ │>3+4- │ │ expected items │
│ │ ^│ │ │
│ └──────┘ └───────────────── reason │
│ ▲ │
│ │ │
│ line info │
└───────────────────────────────────────────────────────────────────────┘
A vanilla error consists of three unique components: the
unexpected
component, which describes the problematic token;
the expected
component, which describes the possible parses that
would have avoided the error; and the reason
component, which
gives additional context for an error. These are in addition to
parts shared with specialised errors: the source
, position
,
and the lineInfo
. Any of the three unique components may be
missing from the error, and the ErrorBuilder
will need to be able
to handle this.
Within both unexpected
and expected
, items can have one of three forms: named
, which indicates they came from labels;
raw
, which means they came directly from the input itself; and
endOfInput
, which means no more input was available. All three
of these states can be formatted independently.
Specialised Errors
┌───────────────────────────────────────────────────────────────────────┐
│ ┌────────────────┐◄──────── position │
│ source │ │ │
│ │ │ line col │
│ ▼ │ │ │ │
│ ┌─────┐ │ ▼ ▼ │
│ In foo.txt (line 1, column 5): │
│ │
│ ┌───► something went wrong │
│ │ │
│ message ──┼───► it looks like a binary operator has no argument │
│ │ │
│ └───► '-' is a binary operator │
│ ┌──────┐ │
│ │>3+4- │ │
│ │ ^│ │
│ └──────┘ │
│ ▲ │
│ │ │
│ line info │
└───────────────────────────────────────────────────────────────────────┘
In contrast to the vanilla error, specialised errors have one
unique component, messages
, which is zero or more lines of
bespoke error messages generated by fail
combinators.
The ErrorBuilder
Typeclass
Within the ErrorBuilder
trait, there is a number of undefined
type aliases. Each of these must be implemented by an extender and
provide an internal type to represent different components within
the system. These are used to ensure maximal flexiblity of the
user to pick how each component should be represented without
exposing unnecessary information into the rest of the system.
After these types are specified, the methods of the typeclass can be implemented. These put together the primtive-most components and compose them into the larger whole. The documentation of the typeclass details the role of these well enough, however. For example's sake, however, these are the two shapes of call that will be made for the different types of error messages:
Vanilla
(line 1, column 5):
unexpected end of input
expected "(", "negate", digit, or letter
'-' is a binary operator
>3+4-
^
val builder = implicitly[ErrorBuilder[String]]
builder.format (
builder.pos(1, 5),
builder.source(None),
builder.vanillaError (
builder.unexpected(Some(builder.endOfInput)),
builder.expected (
builder.combineExpectedItems(Set (
builder.raw("("),
builder.raw("negate"),
builder.named("digit"),
builder.named("letter")
))
),
builder.combineMessages(List(
builder.reason("'-' is a binary operator")
)),
builder.lineInfo("3+4-", Nil, Nil, 4, 4)
)
)
One builder call not shown here, is a call to
builder.unexpectedToken
. This is a bigger discussion and is
deferred to Token Extraction in ErrorBuilder
Specialised
In file 'foo.txt' (line 2, column 6):
first message
second message
>first line of input
>second line
^^^
>third line
>fourth line
val builder = implicitly[ErrorBuilder[String]]
builder.format (
builder.pos(2, 6),
builder.source(Some("foo.txt")),
builder.specialisedError (
builder.combineMessages(List(
builder.message("first message"),
builder.message("second message"),
)),
builder.lineInfo("second line",
List("first line of input"),
List("third line", "fourth line"),
5,
3)
)
)
Constructing Test Errors
As an example of how to construct an ErrorBuilder
for a type,
consider the following representation of TestError
:
case class TestError(pos: (Int, Int), lines: TestErrorLines)
sealed trait TestErrorLines
case class VanillaError(
unexpected: Option[TestErrorItem],
expecteds: Set[TestErrorItem],
reasons: Set[String]) extends TestErrorLines
case class SpecialisedError(msgs: Set[String]) extends TestErrorLines
sealed trait TestErrorItem
case class TestRaw(item: String) extends TestErrorItem
case class TestNamed(item: String) extends TestErrorItem
case object TestEndOfInput extends TestErrorItem
This type, as will become evident from the formatter derived
from it, is lossy and does not perfectly encode all the
information available. Notice that TestErrorItem
is a supertype
of TestRaw
, TestNamed
, and TestEndOfInput
: this is
required, as the representation of each must all share a common
supertype.
To construct an ErrorBuilder[TestError]
, the type aliases must
first be filled in:
class TestErrorBuilder extends ErrorBuilder[TestError] {
type Position = (Int, Int)
type Source = Unit
type ErrorInfoLines = TestErrorLines
type Item = TestErrorItem
type Raw = TestRaw
type Named = TestNamed
type EndOfInput = TestEndOfInput.type
type Message = String
type Messages = Set[String]
type ExpectedItems = Set[TestErrorItem]
type ExpectedLine = Set[TestErrorItem]
type UnexpectedLine = Option[TestErrorItem]
type LineInfo = Unit
//...
}
These types can be determined by examining the shape of TestError
: for bits that it doesn't work, these are set to Unit
. With these in place, the refined types of the typeclass
methods make it very easy to fill in the gaps:
class TestErrorBuilder extends ErrorBuilder[TestError] {
//...
def format(pos: (Int, Int), source: Unit,
lines: TestErrorLines): TestError = TestError(pos, lines)
def vanillaError(
unexpected: Option[TestErrorItem],
expected: Set[TestErrorItem],
reasons: Set[String],
line: Unit
): TestErrorLines = VanillaError(unexpected, expected, reasons)
def specialisedError(
msgs: Set[String],
line: Unit
): TestErrorLines = SpecialisedError(msgs)
def pos(line: Int, col: Int): (Int, Int) = (line, col)
def source(sourceName: Option[String]): Unit = ()
def combineExpectedItems(alts: Set[TestErrorItem]): Set[TestErrorItem] = alts
def combineMessages(alts: Seq[String]): Set[String] = alts.toSet
def unexpected(item: Option[TestErrorItem]): Option[TestErrorItem] = item
def expected(alts: Set[TestErrorItem]): Set[TestErrorItem] = alts
def message(msg: String): String = msg
def reason(msg: String): String = msg
def raw(item: String): TestRaw = TestRaw(item)
def named(item: String): TestNamed = TestNamed(item)
val endOfInput: TestEndOfInput.type = TestEndOfInput
val numLinesAfter: Int = 0
val numLinesBefore: Int = 0
def lineInfo(
line: String,
linesBefore: Seq[String],
linesAfter: Seq[String],
errorPointsAt: Int, errorWidth: Int
): Unit = ()
// The implementation of this is usually provided by a mixed-in
// token extractor, discussed in `tokenextractors`
def unexpectedToken(
cs: Iterable[Char],
amountOfInputParserWanted: Int,
lexicalError: Boolean
): Token = ???
}
Each of the methods above do the bare minimum work to
satisfy the types. As noted in the comment, the implementation
of unexpectedToken
is usually done by mixing in a
token extractor, which is explained here.