Developer guide
The module Language.Astview.Languages contains a list of all languages (and thus parsers) known to astview.
New languages are appended to that list.
The following sections show how to define a new language.
First of all we introduce the data type for parse errors. Since parsers return different amounts of error information, we distinguish between three kinds of errors:
    data Error
      = Err                         -- ^ no specific error information
      | ErrMessage String           -- ^ plain error message
      | ErrLocation SrcSpan String  -- ^ error message with position information
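To illustrate how the three constructors differ in the information they carry, here is a minimal sketch that renders an error as a one-line message. The types are local stand-ins that mirror the definitions in this guide, and describeError is a hypothetical helper, not part of astview.

```haskell
-- Local stand-ins mirroring the types described in this guide
-- (assumption: the real definitions live in Language.Astview.Language).
data SrcPos = SrcPos { line :: Int, column :: Int } deriving (Eq, Show)
data SrcSpan = SrcSpan { begin :: SrcPos, end :: SrcPos } deriving (Eq, Show)

data Error
  = Err                         -- no specific error information
  | ErrMessage String           -- plain error message
  | ErrLocation SrcSpan String  -- error message with position information
  deriving (Eq, Show)

-- | Hypothetical helper: turn an 'Error' into a one-line message,
-- using as much detail as the constructor provides.
describeError :: Error -> String
describeError Err = "parse error"
describeError (ErrMessage msg) = "parse error: " ++ msg
describeError (ErrLocation (SrcSpan (SrcPos l c) _) msg) =
  "parse error at " ++ show l ++ ":" ++ show c ++ ": " ++ msg
```

Loaded into GHCi, describeError can be applied to each constructor to see how much detail survives in the message.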
In order to extend astview with your own language you need to know the structure of the data type Language, which we use to represent languages and their parsers.
    data Language = Language
      { name   :: String
      , syntax :: String
      , exts   :: [String]
      , parse  :: String -> Either Error Ast
      }
The name is a string used only for display in the GUI, whereas the second attribute syntax is the name of the syntax highlighter that should be associated with the language. If no syntax highlighting is desired, the empty string [] works here.
Astview uses the same syntax highlighting as gedit, so you might find the name of your language there.
The attribute exts defines a list of file extensions that should be associated with the language.
When opening a file, astview can automatically select a language based on the file extension.
For this automatic selection to be unambiguous, the file extensions of the languages known to astview should not overlap.
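The following sketch shows one way extension-based selection could work and why overlapping extensions make it ambiguous. It is an illustration under stated assumptions, not astview's actual implementation: Ast and Error are reduced to local stand-ins, and selectLanguage and overlappingExts are hypothetical names.

```haskell
import Data.List (find, group, isSuffixOf, nub, sort)

-- Stand-ins so the sketch compiles on its own (assumptions, not the real types).
newtype Ast = Ast String
data Error = Err | ErrMessage String

data Language = Language
  { name   :: String
  , syntax :: String
  , exts   :: [String]
  , parse  :: String -> Either Error Ast
  }

-- | Hypothetical: pick the first language whose extension list matches
-- the end of the file name.
selectLanguage :: [Language] -> FilePath -> Maybe Language
selectLanguage langs file = find matches langs
  where
    matches lang = any (`isSuffixOf` file) (exts lang)

-- | Hypothetical: extensions claimed by more than one language. If this
-- list is non-empty, the result of 'selectLanguage' depends on list order.
overlappingExts :: [Language] -> [String]
overlappingExts langs =
  [e | (e : _ : _) <- group (sort (concatMap (nub . exts) langs))]
```

With this definition, a file extension shared by two languages always resolves to whichever language comes first in the list, which is the ambiguity the guideline above avoids.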
The parse function maps the input string either to an error value or to an abstract syntax tree.
After an input string has been parsed, one has to transform the parsed tree into our internal representation type Ast (see the documentation of Language.Astview.Language for details on Ast).
The module Language.Astview.DataTree offers several type-generic functions for that purpose.
The most basic one is the function dataToAstSimpl :: Data t => t -> Ast, which transforms an arbitrary value whose type is an instance of class Data into our internal type Ast by printing the constructors and storing them in a tree.
In order to simplify the tree, dataToAstSimpl represents a String not as a list of Char values, but as a single node in the tree.
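The idea behind such a type-generic conversion can be sketched with Data.Data from base alone. The code below is not the implementation of dataToAstSimpl; it is a minimal illustration that uses a local rose tree (Rose) as a stand-in for Ast and a hypothetical function toRose.

```haskell
import Data.Data (Data, cast, gmapQ, showConstr, toConstr)

-- Stand-in for astview's Ast (assumption: the real type carries more data).
data Rose = Rose String [Rose] deriving (Eq, Show)

-- | Hypothetical sketch: label every node with its constructor name and
-- recurse into the immediate children. A String becomes a single leaf
-- instead of a chain of (:) cells.
toRose :: Data t => t -> Rose
toRose t = case cast t :: Maybe String of
  Just s  -> Rose (show s) []
  Nothing -> Rose (showConstr (toConstr t)) (gmapQ toRose t)
```

The cast to String is what keeps string literals compact; without that case the tree would contain one cons node per character.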
This section walks through adding Haskell support to astview.
We use the abstract syntax and parser from package haskell-src-exts.
The name and the syntax highlighter are both the string "Haskell".
We associate both classical Haskell files (".hs") and literate Haskell files (".lhs") with this language.
The following code applies the parser to the file content and converts the parsed value to our data type Ast using dataToAstSimpl:
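    parsehs :: String -> Either Error Ast
    parsehs s =
      case parse s :: ParseResult (Module SrcSpan) of
        -- success: convert the parsed module generically
        ParseOk t -> Right $ dataToAstSimpl t
        -- failure: keep line and column of the error for the user
        ParseFailed (SrcLoc _ l c) m ->
          Left $ ErrLocation (position l c) m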
If the parse fails, the parser returns information about the incorrect source. We reuse this data to help the user of astview find the faulty source positions.
Putting it all together, we can now define a value of type Language in order to support Haskell sources in astview:
    haskellexts :: Language
    haskellexts = Language "Haskell" "Haskell" [".hs",".lhs"] parsehs
After appending haskellexts to the list languages of known languages in module Language.Astview.Languages and reinstalling, astview can display the abstract syntax tree of Haskell files.
In order to get astview to work with source locations, a bit more work has to be done.
We now assume that the parser builds an abstract syntax tree annotated with source locations.
The function dataToAstSimpl doesn't know which values in the tree are source locations.
Our type for source locations is defined in module Language.Astview.Language:
    data SrcPos  = SrcPos  { line  :: Int    , column :: Int }
    data SrcSpan = SrcSpan { begin :: SrcPos , end    :: SrcPos }
One should use the smart constructor functions span, position and linear to create source locations, since they apply validity checks.
Instead of the function dataToAstSimpl, which does not support creation of source locations, we use
    dataToAst :: (Data t) => (forall span. Data span => span -> Maybe SrcSpan)
                          -> (forall st . Typeable st => st -> Bool)
                          -> t -> Ast
which takes a source location selector as its first argument.
The given function is automatically applied to all nodes of the tree to extract their source location.
Its result type is wrapped in Maybe since not every node of a tree has an associated source location.
The second argument is a predicate selecting subtrees that should not be displayed. After annotating the subtrees with their respective source locations, one often does not want the subtrees that represent source locations to occur in the displayed tree. For that purpose one can hand a predicate to dataToAst, and all subtrees satisfying the predicate will not be displayed by astview.
In most of the cases one wants values of exactly one type to be removed from the tree. The function
    dataToAstIgnoreByExample :: (Data t,Typeable t,Typeable b,Data b)
                             => (forall a . (Data a,Typeable a) => a -> Maybe SrcSpan)
                             -> b -> t -> Ast
works like dataToAst, but instead of a predicate one passes a value of an arbitrary type b, and all values of type b will be removed from the displayed tree.
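To make the predicate and the "by example" idea concrete, here is a minimal sketch using only base. It does not reproduce the implementation of dataToAst or dataToAstIgnoreByExample: Rose stands in for Ast, Pos and Expr are toy types, and toRoseWithout and sameTypeAs are hypothetical names. The source location selector is left out to keep the sketch short.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE RankNTypes #-}

import Data.Data (Data, Typeable, gmapQ, showConstr, toConstr, typeOf)

-- Stand-in for astview's Ast (assumption: the real type carries more data).
data Rose = Rose String [Rose] deriving (Eq, Show)

-- Toy syntax tree annotated with positions (hypothetical example types).
data Pos = Pos Int Int deriving (Data, Show)
data Expr = Lit Pos Int | Add Pos Expr Expr deriving (Data, Show)

-- | Build a tree of constructor names, skipping every immediate or nested
-- subtree for which the predicate holds.
toRoseWithout :: Data t => (forall st. Typeable st => st -> Bool) -> t -> Rose
toRoseWithout hide t = Rose (showConstr (toConstr t)) (concat (gmapQ child t))
  where
    child :: Data d => d -> [Rose]
    child d
      | hide d = []
      | otherwise = [toRoseWithout hide d]

-- | Predicate that holds for values of the same type as the example.
-- The example is never evaluated, so 'undefined' is a valid argument.
sameTypeAs :: (Typeable a, Typeable b) => a -> b -> Bool
sameTypeAs example x = typeOf example == typeOf x
```

Passing const False keeps every subtree, while sameTypeAs (undefined :: Pos) drops all Pos nodes, which mirrors the difference between the two functions described above.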
We only need to change the function parsehs from our example above in order to add source location support for Haskell.
Since we have to distinguish between the source location type from haskell-src-exts and our internal type, we import the Haskell source locations qualified:
    import qualified Language.Haskell.Exts.SrcLoc as HsSrcLoc
First of all we need to define a function that returns the associated source location for an arbitrary node in the abstract syntax.
Thanks to the structure of the abstract syntax in haskell-src-exts this can be done completely type-generically.
The source location is always of type SrcSpan and, if present, is the left-most subtree of a tree.
We use a zipper from package syz to go to the left-most subtree and extract the source location information:
    getSrcLoc :: Data t => t -> Maybe SrcSpan
    getSrcLoc t = down' (toZipper t) >>= query (def `extQ` atSpan) where
      def :: a -> Maybe SrcSpan
      def _ = Nothing
      atSpan :: HsSrcLoc.SrcSpan -> Maybe SrcSpan
      atSpan (HsSrcLoc.SrcSpan _ c1 c2 c3 c4) = Just $ span c1 c2 c3 c4
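The "left-most subtree" lookup can also be illustrated without a zipper. The sketch below is an alternative formulation using only Data.Data from base, not the syz-based code above; Loc and Expr are toy types, and leftmostChildAs and getLoc are hypothetical names.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Control.Monad (join)
import Data.Data (Data, Typeable, cast, gmapQ)
import Data.Maybe (listToMaybe)

-- Toy location and syntax types (hypothetical, for illustration only).
data Loc = Loc Int Int deriving (Data, Eq, Show)
data Expr = Lit Loc Int | Neg Loc Expr deriving (Data, Eq, Show)

-- | Look at the left-most immediate child of a value and return it if it
-- has the requested type; values without children yield Nothing.
leftmostChildAs :: (Data t, Typeable a) => t -> Maybe a
leftmostChildAs t = join (listToMaybe (gmapQ cast t))

-- | Selector in the style of getSrcLoc, specialised to the toy type Loc.
getLoc :: Data t => t -> Maybe Loc
getLoc = leftmostChildAs
```

Because the function only inspects the first child, a node whose location is stored elsewhere (or not at all) simply yields Nothing, matching the Maybe in the selector's type.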
To add source location support, we now pass getSrcLoc to the function dataToAst as the selector for source locations:
    parsehs :: String -> Either Error Ast
    parsehs s = case parse s :: ParseResult (Module HsSrcLoc.SrcSpan) of
      ParseOk t -> Right $ dataToAst getSrcLoc (const False) t
      ParseFailed (HsSrcLoc.SrcLoc _ l c) m -> Left $ ErrLocation (position l c) m
Using this version of parsehs as the parse function enables astview to jump between associated positions in the source text and the abstract syntax tree.
The resulting trees still contain all the source location information as subtrees, although it is already stored internally in the Ast. Since source locations are only meta-information about the subtrees, and one can jump from a subtree to its position in the source, it is not necessary to display source locations as nodes in the abstract syntax tree. One can remove them from the tree by using the function dataToAstIgnoreByExample, which causes all values of type HsSrcLoc.SrcSpan to be discarded from the tree:
    parsehs :: String -> Either Error Ast
    parsehs s = case parse s :: ParseResult (Module HsSrcLoc.SrcSpan) of
      ParseOk t -> Right $ dataToAstIgnoreByExample getSrcLoc
                                                    (undefined::HsSrcLoc.SrcSpan)
                                                    t
      ParseFailed (HsSrcLoc.SrcLoc _ l c) m -> Left $ ErrLocation (position l c) m