Configuring Errors (parsley.token.errors)
The default error messages generated by the parsers that the Lexer
produces are serviceable, but they can be improved considerably.
errors.ErrorConfig
The ErrorConfig class is where all the
configuration for error messages generated by the Lexer resides. Everything in
this class will have a default implementation (nothing is abstract); this ensures
easy backwards compatibility. Each of the configurations
inside takes one of the following forms:
- A plain String argument, usually indicating the name of a compulsory label.
- A LabelConfig, which can either be unconfigured, hidden, or a regular label name.
- A LabelWithExplainConfig, which augments the previous configuration to also allow for a reason to be added, if desired.
- A FilterConfig, or one of its specific subtypes, which can be used to handle the messages for ill-conforming data.
- A special configuration, which is used for very specific error messages, usually arising from one of the more advanced error patterns (see Advanced Error Messages).
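As a minimal sketch of how such a configuration is put to use, a custom class or object can extend ErrorConfig, override only the members of interest, and be handed to the Lexer. The member name labelNameIdentifier, the LexicalDesc.plain description, and the two-argument Lexer constructor are assumptions based on parsley 4.x and should be checked against the version in use; MyErrors and Example are hypothetical names.

```scala
import parsley.token.Lexer
import parsley.token.descriptions.LexicalDesc
import parsley.token.errors.ErrorConfig

// Hypothetical configuration: anything not overridden keeps its default.
object MyErrors extends ErrorConfig {
  // A plain String configuration (assumed member name): the compulsory
  // label used when an identifier was expected.
  override def labelNameIdentifier: String = "variable name"
}

object Example {
  // The error configuration is supplied alongside the lexical description
  // (assumed constructor shape: description first, error config second).
  val lexer = new Lexer(LexicalDesc.plain, MyErrors)
}
```

Because nothing in ErrorConfig is abstract, an override like this can be as small or as large as needed.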
Configuring Labels and Explains
Labels are one of the most common additional error configurations that can be applied throughout the pre-made lexer parsers. Some, but not all, of these labels can be configured to also produce a reason if the configuree cannot be parsed (either for why it should be there or what it requires). The hierarchy of components is visualised by the following UML diagram:
classDiagram
LabelWithExplainConfig --|> LabelConfig
LabelWithExplainConfig --|> ExplainConfig
LabelConfig --|> Label
LabelConfig --|> Hidden
ExplainConfig --|> Reason
LabelConfig --|> NotConfigured
LabelWithExplainConfig --|> NotConfigured
ExplainConfig --|> NotConfigured
LabelWithExplainConfig --|> LabelAndReason
class LabelWithExplainConfig { <<trait>> }
class LabelConfig { <<trait>> }
class ExplainConfig { <<trait>> }
class Label {
<<class>>
label: Seq[String]*
}
class Hidden { <<object>> }
class NotConfigured { <<object>> }
class Reason {
<<class>>
reason: String*
}
class LabelAndReason {
<<class>>
label: String*
reason: String*
}
Broadly, a component may be marked as a LabelWithExplainConfig, which
means it can contain labels, reasons, or both; as a LabelConfig if a reason
wouldn't make sense; or as an ExplainConfig if it does not make sense to name it.
Configuring Labels
Adding a label can be one of the following:
- Label: this labels the corresponding parser with one or more labels -- this also applies for LabelAndReason.
- Hidden: this suppresses any error messages arising from the corresponding parser.
- NotConfigured: this doesn't alter the error messages from the corresponding parser.
Adding Explanations
Adding an explanation can be one of the following:
- Reason: this adds a reason for the corresponding parser though doesn't change the labelling -- unless LabelAndReason is used instead.
- NotConfigured: this doesn't alter the error messages from the corresponding parser.
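The label and explanation forms can be sketched as plain values, each ascribed to the most specific trait from the diagram that it is known to satisfy. This assumes that each class is built through a companion apply taking strings, and that the LabelAndReason parameters are named label and reason; the message texts and the LabelExamples object are illustrative only.

```scala
import parsley.token.errors._

object LabelExamples {
  // Labels the corresponding parser.
  val named: LabelConfig = Label("escape sequence")

  // Suppresses any error messages arising from the corresponding parser.
  val hidden: LabelConfig = Hidden

  // Adds a reason without changing the labelling.
  val explained: ExplainConfig =
    Reason("escape sequences begin with a backslash")

  // Labels the parser and adds a reason in one configuration.
  // Named arguments are used since the parameter order is an assumption.
  val both: LabelWithExplainConfig =
    LabelAndReason(label = "escape sequence",
                   reason = "escape sequences begin with a backslash")

  // Leaves the error messages of the corresponding parser unaltered.
  val untouched: LabelWithExplainConfig = NotConfigured
}
```

A value of this kind is what an overridden member of ErrorConfig returns, provided the type of that member admits it.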
Configuring Filtering
Some parsers perform filtering on their results, for instance checking if a numeric literal
is within a certain bit-width. The messages generated when these filters fail are controlled
by the FilterConfig[A], where A is the type of value being filtered. The below diagram
shows how the various sub-configurations are laid out.
classDiagram
FilterConfig~A~ --|> VanillaFilterConfig~A~
FilterConfig~A~ --|> SpecialisedFilterConfig~A~
VanillaFilterConfig~A~ --|> Because~A~
VanillaFilterConfig~A~ --|> Unexpected~A~
VanillaFilterConfig~A~ --|> UnexpectedBecause~A~
VanillaFilterConfig~A~ --|> BasicFilter~A~
SpecialisedFilterConfig~A~ --|> BasicFilter~A~
SpecialisedFilterConfig~A~ --|> SpecialisedMessage~A~
class FilterConfig~A~ { <<trait>> }
class VanillaFilterConfig~A~ { <<trait>> }
class SpecialisedFilterConfig~A~ { <<trait>> }
class SpecialisedMessage~A~ {
<<class>>
message(x: A) Seq[String]*
}
class BasicFilter~A~ { <<class>> }
class Unexpected~A~ {
<<class>>
unexpected(x: A) String*
}
class Because~A~ {
<<class>>
reason(x: A) String*
}
class UnexpectedBecause~A~ {
<<class>>
reason(x: A) String*
unexpected(x: A) String*
}
Some filters within the Lexer are best left as a specialised or vanilla error, which is
why the hierarchy is constrained. Beyond that, the leaf classes allow for different
combinations of adding reasons, altering the unexpected message, or providing bespoke error messages.
The BasicFilter here does not attach any special error messages to the filtering, having the
effect of just using the basic filter combinator internally.
Special Configuration
Some parts of the error configuration for the Lexer are special. In particular, these are
preventRealDoubleDroppedZero and the two verifiedXBadCharsUsedInLiteral configurations (one each for Char and String).
These provide very hand-crafted error messages for specific scenarios, based on the ideas of
the Preventative and Verified Errors patterns.
Preventing Double-Dropped Zero
When writing floating point literals, it is possible, depending on the configuration, to drop a
leading or trailing zero and write, say, .0 or 0. instead. However, it should not be possible to
write a lone . as a literal, with both zeros dropped. Overriding preventRealDoubleDroppedZero is the
way to prevent this and provide a good error message in the process. There are a few options:
- UnexpectedZeroDot: sets the unexpected message to the given string when just . is seen.
- ZeroDotReason: does not set an unexpected message, but adds a reason explaining why . is illegal.
- UnexpectedZeroDotWithReason: combines both above behaviours.
- ZeroDotFail: throws an error with the given bespoke error messages.
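A minimal sketch of one of these options is shown below. It assumes that UnexpectedZeroDotWithReason is built through a companion apply taking the unexpected message followed by the reason. The return type of the override is left to inference, since the name of the configuration's type is not given here. RealErrors and the message texts are hypothetical.

```scala
import parsley.token.errors.{ErrorConfig, UnexpectedZeroDotWithReason}

// Hypothetical configuration that rejects a lone "." with a tailored message.
object RealErrors extends ErrorConfig {
  // Assumed argument order: unexpected message first, then the reason.
  override def preventRealDoubleDroppedZero =
    UnexpectedZeroDotWithReason(
      "lone decimal point",
      "a real number needs a digit on at least one side of the point"
    )
}
```

Swapping in ZeroDotReason or UnexpectedZeroDot would keep only one half of this behaviour.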
Preventing Bad Characters in Literals
When writing string and character literals, some characters may be considered illegal. For instance,
a language may not allow " to appear unescaped within a character literal. To help make it clear
why a character was rejected by the parser, verifiedCharBadCharsUsedInLiteral and
verifiedStringBadCharsUsedInLiteral allow for fine-grained error messages to be generated when
an illegal character occurs. There are a few options:
- BadCharsFail takes a Map[Int, Seq[String]] from Unicode characters to the messages to generate if one of the keys was found in the string.
- BadCharsReason takes a Map[Int, String] from Unicode characters to the reason they generate if they are found in the string.
- Unverified does no additional checks for bad characters.
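The two map-based options might be used as in the following sketch, where the keys are characters converted to Int. It assumes that BadCharsReason and BadCharsFail are built through a companion apply taking the maps described above. LiteralErrors, the chosen characters, and the message texts are hypothetical.

```scala
import parsley.token.errors.{BadCharsFail, BadCharsReason, ErrorConfig}

// Hypothetical configuration for illegal characters inside literals.
object LiteralErrors extends ErrorConfig {
  // Character literals: attach a reason when an unescaped " is found.
  override def verifiedCharBadCharsUsedInLiteral =
    BadCharsReason(Map(
      '"'.toInt -> "double quotes must be escaped in character literals"
    ))

  // String literals: replace the error with bespoke messages when a raw
  // newline is found.
  override def verifiedStringBadCharsUsedInLiteral =
    BadCharsFail(Map(
      '\n'.toInt -> Seq("string literals cannot span multiple lines")
    ))
}
```

Leaving either member at Unverified skips the additional check for that kind of literal.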