License | BSD-3-Clause |
---|---|
Maintainer | Jamie Willis, Gigaparsec Maintainers |
Stability | stable |
Safe Haskell | Safe |
Language | Haskell2010 |
This module defines the ErrorBuilder typeclass, which specifies how to generate an error from a parser as a specified type.
An instance of this typeclass is required when calling parse (or similar). By default, gigaparsec defines its own instance for ErrorBuilder String, found in this module.
To implement ErrorBuilder, a number of methods must be defined, as well as the representation types for a variety of different components; the relation between the various methods is closely linked to the types that they both produce and consume. To change only the basics of formatting without having to define the entire instance, use the methods found in Text.Gigaparsec.Errors.DefaultErrorBuilder.
Since: 0.2.0.0
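To make the link between methods and representation types concrete, here is a minimal sketch of a cut-down builder class. MiniBuilder, PlainText, and render are hypothetical names defined locally for illustration; they are not part of gigaparsec. The sketch mirrors only the pos, source, and build part of the real class, and its build takes a plain list of content lines where the real class takes an ErrorInfoLines.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}

-- Hypothetical, cut-down analogue of ErrorBuilder: every method either
-- produces or consumes one of the associated representation types.
class MiniBuilder err where
  type Position err
  type Source err
  pos    :: Word -> Word -> Position err
  source :: Maybe FilePath -> Source err
  build  :: Position err -> Source err -> [String] -> err

-- A plain-text error representation (an assumption made for this sketch).
newtype PlainText = PlainText String
  deriving (Eq, Show)

instance MiniBuilder PlainText where
  type Position PlainText = String
  type Source PlainText = Maybe String
  pos line col = "line " ++ show line ++ ", column " ++ show col
  source = id
  build p src body = PlainText (unlines (header : map ("  " ++) body))
    where
      -- The file name is omitted from the preamble when it is not known.
      header = maybe "" (\file -> "In " ++ file ++ " ") src ++ "(" ++ p ++ "):"

-- Drives the builder the way a parser might: the preamble parts are built
-- first and then handed to build together with the content lines.
render :: forall err. MiniBuilder err
       => Maybe FilePath -> Word -> Word -> [String] -> err
render file line col = build (pos @err line col) (source @err file)
```

The caller selects the representation with a type application, for example render @PlainText (Just "foo.txt") 1 5 ["unexpected end of input"]. Because pos and source mention err only through a type family, the sketch needs AllowAmbiguousTypes and explicit type applications.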
Synopsis
- class Ord (Item err) => ErrorBuilder err where
- type Position err
- type Source err
- type ErrorInfoLines err
- type ExpectedItems err
- type Messages err
- type UnexpectedLine err
- type ExpectedLine err
- type Message err
- type LineInfo err
- type Item err
- build :: Position err -> Source err -> ErrorInfoLines err -> err
- pos :: Word -> Word -> Position err
- source :: Maybe FilePath -> Source err
- vanillaError :: UnexpectedLine err -> ExpectedLine err -> Messages err -> LineInfo err -> ErrorInfoLines err
- specialisedError :: Messages err -> LineInfo err -> ErrorInfoLines err
- combineExpectedItems :: Set (Item err) -> ExpectedItems err
- combineMessages :: [Message err] -> Messages err
- unexpected :: Maybe (Item err) -> UnexpectedLine err
- expected :: ExpectedItems err -> ExpectedLine err
- reason :: String -> Message err
- message :: String -> Message err
- lineInfo :: String -> [String] -> [String] -> Word -> Word -> Word -> LineInfo err
- numLinesBefore :: Int
- numLinesAfter :: Int
- raw :: String -> Item err
- named :: String -> Item err
- endOfInput :: Item err
- unexpectedToken :: NonEmpty Char -> Word -> Bool -> Token
- data Token
How an Error is Structured
There are two kinds of error messages that are generated by gigaparsec: Specialised and Vanilla. These are produced by different combinators
and can be merged with other errors of the same type if both errors appear
at the same offset. However, Specialised errors will take precedence
over Vanilla errors if they appear at the same offset. The most
common form of error is the Vanilla variant, which is generated by
most combinators, except for some in Text.Gigaparsec.Errors.Combinator.
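The merging rule described above can be sketched with a small local model. ErrKind, Err, and merge are hypothetical names and do not reflect gigaparsec's internal representation. The paragraph only covers errors at the same offset, so the behaviour for different offsets (keep the error that got further into the input) is an assumption of this sketch.

```haskell
-- Hypothetical model of the two kinds of error.
data ErrKind = Vanilla | Specialised
  deriving (Eq, Show)

data Err = Err
  { offset   :: Word      -- where in the input the error occurred
  , kind     :: ErrKind
  , contents :: [String]  -- stand-in for the content of the error
  } deriving (Eq, Show)

merge :: Err -> Err -> Err
merge e1 e2 = case compare (offset e1) (offset e2) of
  -- ASSUMPTION: at different offsets, the further error is kept.
  GT -> e1
  LT -> e2
  EQ -> case (kind e1, kind e2) of
    -- Specialised takes precedence over Vanilla at the same offset.
    (Specialised, Vanilla) -> e1
    (Vanilla, Specialised) -> e2
    -- Errors of the same kind at the same offset are merged.
    _ -> e1 { contents = contents e1 ++ contents e2 }
```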
Both types of error share some common structure, namely:
- The error preamble, which has the file and the position.
- The content lines, the specifics of which differ between the two types of error.
- The context lines, which show the surrounding lines of input for contextualisation.
Vanilla Errors
There are three kinds of content line found in a Vanilla error:
- Unexpected info: this contains information about the kind of token that caused the error.
- Expected info: this contains the information about what kinds of token could have avoided the error.
- Reasons: these are the bespoke reasons that an error has occurred (as generated by explain).
There can be at most one unexpected line, at most one expected line, and zero or more reasons. Both the unexpected and expected info are built up of error items, which are one of: the end of input, a named token, or raw input taken from the parser definition. These can all be formatted separately.
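As an illustration of how error items might be formatted separately and then combined, here is a minimal sketch. The Item type and every function name below are hypothetical and defined locally. The quoting of raw input and the comma-separated "or" list are assumptions chosen to resemble the example message used on this page, not a statement of what the default builder does.

```haskell
import Data.List (intercalate)
import qualified Data.Set as Set

-- Hypothetical item type. The derived Ord (needed to store items in a Set)
-- orders Raw before Named before EndOfInput.
data Item = Raw String | Named String | EndOfInput
  deriving (Eq, Ord, Show)

formatItem :: Item -> String
formatItem (Raw s)    = show s          -- quote (and escape) raw input
formatItem (Named s)  = s
formatItem EndOfInput = "end of input"

-- Joins alternatives as "a", "a or b", or "a, b, or c".
disjunct :: [String] -> Maybe String
disjunct []     = Nothing
disjunct [x]    = Just x
disjunct [x, y] = Just (x ++ " or " ++ y)
disjunct xs     = Just (intercalate ", " (init xs) ++ ", or " ++ last xs)

-- At most one expected line: Nothing when there are no expected items.
expectedLine :: Set.Set Item -> Maybe String
expectedLine = fmap ("expected " ++) . disjunct . map formatItem . Set.toList

-- At most one unexpected line: Nothing when no item is known.
unexpectedLine :: Maybe Item -> Maybe String
unexpectedLine = fmap (("unexpected " ++) . formatItem)
```

Keeping the items in a Set removes duplicates and fixes their order, which is why the item type must be orderable.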
The overall structure of a Vanilla error is given in the following diagram:
┌───────────────────────────────────────────────────────────────────────┐
│   Vanilla Error                                                       │
│                          ┌────────────────┐◄──────── position         │
│                  source  │                │                           │
│                     │    │   line      col│                           │
│                     ▼    │     │         ││                           │
│                  ┌─────┐ │     ▼         ▼│   end of input            │
│               In foo.txt (line 1, column 5):       │                  │
│                 ┌─────────────────────┐            │                  │
│unexpected ─────►│                     │            │  ┌───── expected │
│                 │          ┌──────────┐ ◄──────────┘  │               │
│                 unexpected end of input               ▼               │
│                 ┌──────────────────────────────────────┐              │
│                 expected "(", "negate", digit, or letter              │
│                          │    └──────┘  └───┘     └────┘ ◄────── named│
│                          │       ▲        └──────────┘ │              │
│                          │       │                     │              │
│                          │      raw                    │              │
│                          └─────────────────┬───────────┘              │
│                 '-' is a binary operator   │                          │
│                 └──────────────────────┘   │                          │
│                ┌──────┐        ▲           │                          │
│                │>3+4- │        │           expected items             │
│                │     ^│        │                                      │
│                └──────┘        └───────────────── reason              │
│                   ▲                                                   │
│                   │                                                   │
│                   line info                                           │
└───────────────────────────────────────────────────────────────────────┘
Specialised Errors
There is only one kind of content found in a Specialised error:
a message. These are completely free-form, and are generated by the failWide combinator, as well as its derived combinators.
There can be one or more messages in a Specialised error.
The overall structure of a Specialised error is given in the following diagram:
┌───────────────────────────────────────────────────────────────────────┐
│   Specialised Error                                                   │
│                          ┌────────────────┐◄──────── position         │
│                  source  │                │                           │
│                     │    │   line       col                           │
│                     ▼    │     │         │                            │
│                  ┌─────┐ │     ▼         ▼                            │
│               In foo.txt (line 1, column 5):                          │
│                                                                       │
│           ┌───► something went wrong                                  │
│           │                                                           │
│ message ──┼───► it looks like a binary operator has no argument       │
│           │                                                           │
│           └───► '-' is a binary operator                              │
│                ┌──────┐                                               │
│                │>3+4- │                                               │
│                │     ^│                                               │
│                └──────┘                                               │
│                   ▲                                                   │
│                   │                                                   │
│                   line info                                           │
└───────────────────────────────────────────────────────────────────────┘
The Error Builder API
Top-Level Construction
These methods help assemble the final products of the error messages.
The build method will return the desired err type, whereas specialisedError and vanillaError both assemble an ErrorInfoLines that the build method can consume.
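A rough sketch of the two assembly steps, assuming every component is represented as plain strings. vanillaContent and specialisedContent are hypothetical stand-ins for vanillaError and specialisedError with all representation types fixed to String or lists of String; the ordering of the lines is an assumption modelled on the diagrams above.

```haskell
import Data.Maybe (catMaybes)
import Data.List.NonEmpty (NonEmpty(..))
import qualified Data.List.NonEmpty as NE

-- A Vanilla body: at most one unexpected line, at most one expected line,
-- zero or more reasons, then the contextual input lines.
vanillaContent :: Maybe String -> Maybe String -> [String] -> [String] -> [String]
vanillaContent unexpectedL expectedL reasons context =
  catMaybes [unexpectedL, expectedL] ++ reasons ++ context

-- A Specialised body: one or more free-form messages, then the context.
specialisedContent :: NonEmpty String -> [String] -> [String]
specialisedContent messages context = NE.toList messages ++ context
```

Either result could then be handed to a build-like function together with the position and source information.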
Error Preamble
These methods control the construction of the preamble of an error message,
consisting of the position and source info.
These are then consumed by build itself.
Contextual Input Lines
These methods control how many lines of input surrounding the error are requested,
and direct how these should be put together to form a LineInfo.
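The following is a minimal sketch of one way the contextual lines could be rendered, taking its arguments in the same order as lineInfo. renderLineInfo is a hypothetical function, and both the '>' prefix on each input line and the treatment of the offset as zero-based are assumptions modelled on the diagrams above.

```haskell
-- Arguments, in order: the error line, the lines before it, the lines after
-- it, the line number (unused here), the offset the caret points at, and the
-- caret width.
renderLineInfo :: String -> [String] -> [String] -> Word -> Word -> Word -> [String]
renderLineInfo line before after _lineNum pointsAt width =
  map ('>' :) before ++ ['>' : line, caretRow] ++ map ('>' :) after
  where
    -- One extra space compensates for the '>' prefix on the input line.
    caretRow = replicate (fromIntegral pointsAt + 1) ' '
            ++ replicate (fromIntegral width) '^'
```

With the input line "3+4-", no surrounding lines, an offset of 4, and a caret width of 1, this yields the two lines ">3+4-" and "     ^".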
Shared Components
These methods control any components or structure shared by both types of messages. In particular, the representation of reasons and messages is shared, as well as how they are combined together to form a unified block of content lines.
Specialised-Specific Components
These methods control the Specialised-specific components, namely the construction of a bespoke error message.
Vanilla-Specific Components
These methods control the Vanilla-specific error components,
namely how expected error items should be combined,
how to represent the unexpected line,
and how to represent reasons generated from explain.
Error Items
These methods control how error items within Vanilla errors are constructed.
These are either the end of input, a named label generated by the label combinator,
or a raw piece of input intrinsically associated with a combinator.
* Item
* endOfInput
* named
* raw
* unexpectedToken
Documentation
Error Builder
class Ord (Item err) => ErrorBuilder err where
This class describes how to construct an error message generated by a parser in a representation, err, that the parser writer desires.
type Position err
The representation type of position information within the generated message.
type Source err
The representation of the file information.
type ErrorInfoLines err
The representation type of the main body within the error message.
type ExpectedItems err
The representation of all the different possible tokens that could have prevented an error.
type Messages err
The representation of the combined reasons or failure messages from the parser.
type UnexpectedLine err
The representation of the information regarding the problematic token.
type ExpectedLine err
The representation of the information regarding the solving tokens.
type Message err
The representation of a reason or a message generated by the parser.
type LineInfo err
The representation of the line of input where the error occurred.
type Item err
The type that represents the individual items within the error. It must be orderable, as it is used within Set.
build
:: Position err | the representation of the position of the error in the input (see the |
-> Source err | the representation of the filename, if it exists (see the |
-> ErrorInfoLines err | the main body of the error message (see |
-> err | the final error message |
This is the top level function, which finally compiles all the built sub-parts into a finished value of type err.
pos
:: Word | the line the error occurred at. |
-> Word | the column the error occurred at. |
-> Position err | a representation of the position. |
Converts a position into the representation type given by Position.
source :: Maybe FilePath -> Source err
Converts the name of the file parsed from, if it exists, into the type given by Source.
vanillaError
:: UnexpectedLine err | information about which token(s) caused the error (see the |
-> ExpectedLine err | information about which token(s) would have avoided the error (see the |
-> Messages err | additional information about why the error occurred (see the |
-> LineInfo err | representation of the line of input that this error occurred on (see the |
-> ErrorInfoLines err |
Vanilla errors are those produced such that they have information about both expected and unexpected tokens. These are usually the default, and are not produced by fail (or any derivative) combinators.
specialisedError
:: Messages err | information detailing the error (see the |
-> LineInfo err | representation of the line of input that this error occurred on (see the |
-> ErrorInfoLines err |
Specialised errors are triggered by fail and any combinators that are implemented in terms of fail. These errors take precedence over the vanilla errors, and contain less, more specialised, information.
combineExpectedItems
:: Set (Item err) | the possible items that fix the error. |
-> ExpectedItems err |
Details how to combine the various expected items into a single representation.
combineMessages :: [Message err] -> Messages err
Details how to combine any reasons or messages generated within a single error. Reasons are used by vanilla messages and messages are used by specialised messages.
unexpected
:: Maybe (Item err) | the |
-> UnexpectedLine err |
Describes how to handle the (potentially missing) information about what token(s) caused the error.
expected
:: ExpectedItems err | the tokens that could have prevented the error (see |
-> ExpectedLine err |
Describes how to handle the information about the tokens that could have avoided the error.
reason :: String -> Message err
Describes how to represent the reasons behind a parser failure. These reasons originate from the explain combinator.
message :: String -> Message err
Describes how to represent the messages produced by the fail combinator (or any that are implemented using it).
lineInfo
:: String | the full line of input that produced this error message. |
-> [String] | the lines of input from just before the one that produced this message (up to |
-> [String] | the lines of input from just after the one that produced this message (up to |
-> Word | the line number of the error message |
-> Word | the offset into the line that the error points at. |
-> Word | how wide the caret in the message should be. |
-> LineInfo err |
Describes how to process the information about the line that the error occurred on, and its surrounding context.
numLinesBefore :: Int
The number of lines of input to request before an error occurred.
numLinesAfter :: Int
The number of lines of input to request after an error occurred.
raw :: String -> Item err
Converts a raw item generated by either the input string or an input-reading combinator without a label.
named :: String -> Item err
Converts a named item generated by a label.
endOfInput :: Item err
Value that represents the end of the input in the error message.
unexpectedToken
:: NonEmpty Char | the remaining input, |
-> Word | the input the parser tried to read when it failed
(this is not guaranteed to be smaller than the length of
|
-> Bool | was this error generated as part of "lexing", or in a wider parser (see |
-> Token | a token extracted from |
Extracts an unexpected token from the remaining input.
When a parser fails, by default an error reports an unexpected token of a specific width. This works well for some parsers, but often it is nice to have the illusion of a dedicated lexing pass: instead of reporting the next few characters as unexpected, an unexpected token can be reported instead. This can take many forms, for instance trimming the token to the next whitespace, only taking one character, or even trying to lex a token out of the stream.
This method can be easily implemented by using an appropriate token extractor from Text.Gigaparsec.Errors.TokenExtractors.
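To illustrate two of the strategies mentioned above, here is a minimal sketch of hand-written extractors with the same argument shape as unexpectedToken. Tok is a hypothetical stand-in for the library's Token type, whose constructors are not shown on this page, and both function names are invented for this sketch; they are not the extractors provided by Text.Gigaparsec.Errors.TokenExtractors.

```haskell
import Data.Char (isSpace)
import Data.List.NonEmpty (NonEmpty(..))

-- Hypothetical stand-in for the library's Token type.
newtype Tok = Tok String
  deriving (Eq, Show)

-- Reports only the first character of the remaining input. The Word and
-- Bool arguments of the real method are accepted but ignored.
firstChar :: NonEmpty Char -> Word -> Bool -> Tok
firstChar (c :| _) _ _ = Tok [c]

-- Trims the token at the next whitespace character. If the input starts
-- with whitespace, that single character is reported instead.
upToWhitespace :: NonEmpty Char -> Word -> Bool -> Tok
upToWhitespace (c :| cs) _ _
  | isSpace c = Tok [c]
  | otherwise = Tok (c : takeWhile (not . isSpace) cs)
```

Taking a NonEmpty Char guarantees there is always at least one character to report, so neither function needs a case for empty input.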