Configuring Errors (parsley.token.errors
)
The default error messages generated by the parsers produced by
the Lexer
are ok, but can be much improved.
errors.ErrorConfig
The ErrorConfig
class is where all the
configuration for error messages generated by the Lexer
resides. Everything in
this class will have a default implementation (nothing is abstract); this ensures
easy backwards compatibility. Each of the configurations
inside takes one of the following forms:
- A plain
String
argument, usually indicating a name of a compulsory label. - A
LabelConfig
, which can either be unconfigured, hidden, or a regular label name. - A
LabelWithExplainConfig
, which augments the previous configuration to also allow for a reason to be added, if desired. - A
FilterConfig
, or one of its specific subtypes, which can be used to handle the messages for ill-conforming data. - A special configuration, which is used for very specific error messages, usually arising from one of the more advanced error patterns (see Advanced Error Messages)
Configuring Labels and Explains
Labels are one of the most common additional error configurations that can be applied throughout the pre-made lexer parsers. Some, but not all, of these labels can be configured to also produce a reason if the configuree cannot be parsed (either for why it should be there or what it requires). The hierarchy of components is visualised by the following UML diagram:
classDiagram LabelWithExplainConfig --|> LabelConfig LabelWithExplainConfig --|> ExplainConfig LabelConfig --|> Label LabelConfig --|> Hidden ExplainConfig --|> Reason LabelConfig --|> NotConfigured LabelWithExplainConfig --|> NotConfigured ExplainConfig --|> NotConfigured LabelWithExplainConfig --|> LabelAndReason class LabelWithExplainConfig { <<trait>> } class LabelConfig { <<trait>> } class ExplainConfig { <<trait>> } class Label { <<class>> label: Seq[String]* } class Hidden { <<object>> } class NotConfigured { <<object>> } class Reason { <<class>> reason: String* } class LabelAndReason { <<class>> label: String* reason: String* }
Broadly, a component may either be marked as a LabelWithExplainConfig
, which
means it can contain either labels, reasons, or both; LabelConfig
if a reason
wouldn't make sense; and a ReasonConfig
if it does not make sense to name.
Configuring Labels
Adding a label can be one of the following:
Label
: this labels the corresponding parser with one or more labels -- this also applies forLabelAndReason
.Hidden
: this suppresses any error messages arising from the corresponding parser.NotConfigured
: this doesn't alter the error messages from the corresponding parser.
Adding Explanations
Adding an explanation can be one of the following:
Reason
: this adds a reason for the corresponding parser though doesn't change the labelling -- unlessLabelAndReason
is used instead.NotConfigured
: this doesn't alter the error messages from the corresponding parser.
Configuring Filtering
Some parsers perform filtering on their results, for instance checking if a numeric literal
is within a certain bit-width. The messages generated when these filters fail is controled
by the FilterConfig[A]
, where A
is the type of value being filtered. The below diagram
shows how the various sub-configurations are laid out.
classDiagram FilterConfig~A~ --|> VanillaFilterConfig~A~ FilterConfig~A~ --|> SpecialisedFilterConfig~A~ VanillaFilterConfig~A~ --|> Because~A~ VanillaFilterConfig~A~ --|> Unexpected~A~ VanillaFilterConfig~A~ --|> UnexpectedBecause~A~ VanillaFilterConfig~A~ --|> BasicFilter~A~ SpecialisedFilterConfig~A~ --|> BasicFilter~A~ SpecialisedFilterConfig~A~ --|> SpecialisedMessage~A~ class FilterConfig~A~ { <<trait>> } class VanillaFilterConfig~A~ { <<trait>> } class SpecialisedFilterConfig~A~ { <<trait>> } class SpecialisedMessage~A~ { <<class>> message(x: A) Seq[String]* } class BasicFilter~A~ { <<class>> } class Unexpected~A~ { <<class>> unexpected(x: A) String* } class Because~A~ { <<class>> reason(x: A) String* } class UnexpectedBecause~A~ { <<class>> reason(x: A) String* unexpected(x: A) String* }
Some filters within the Lexer
are best left as a specialised or vanilla error, which is
why the hierarchy is constrained. Other than that, the various leaf classes allow for various
combinations of adding reasons, altering the unexpected message, or bespoke error messages.
The BasicFilter
here does not attach any special error messages to the filtering, having the
effect of just using the basic filter
combinator internally.
Special Configuration
Some parts of the error configuration for the Lexer
are special. In particular, these are
preventRealDoubleDroppedZero
and the two verifiedXBadCharsUsedInLiteral
(for both Char
and String
).
These provide very hand-crafted error messages for specific scenarios, based on the ideas of
the Preventative and Verified Errors patterns.
Preventing Double-Dropped Zero
When writing floating point literals, it is, depending on the configuration, possible to write .0
,
say, or 0.
. However, it should not be possible to have the literal .
on its own! Overriding
preventRealDoubleDroppedZero
is the way to prevent this, and provide a good error message in the
process. There are a few options:
UnexpectedZeroDot
: sets an unexpected message when just.
is seen to be the given string.ZeroDotReason
: does not set an unexpected message, but adds a reason explaining why.
is illegal.UnexpectedZeroDotWithReason
: combines both above behaviours.ZeroDotFail
: throws an error with the given bespoke error messages.
Preventing Bad Characters in Literals
When writing string and character literals, some characters may be considered illegal. For instance,
a langauge may not allow "
to appear unescaped within a character literal. To help make it clear
why a character was rejected by the parser, verifiedCharBadCharsUsedInLiteral
and
verifiedStringBadCharsUsedInLiteral
allow for fine-grained error messages to be generated when
an illegal character occurs. There are a few options:
BadCharsFail
takes aMap[Int, Seq[String]]
from unicode characters to the messages to generate if one of the keys was found in the string.BadCharsReason
takes aMap[Int, String]
from unicode characters to the reason they generate if they are found in the string.Unverified
does no additional checks for bad characters.