Hege

View the Project on GitHub alicelambda/HEGE

Parser

The Hege parser takes Hege source code and parses it into a list of LispVals. The definition of a LispVal is shown below.

-- | Core Datatypes of LispVal
data LispVal = Atom String 
             | List [LispVal]
             | DottedList [LispVal] LispVal
             | Number Integer
             | String String
             | Bool Bool
             | Char Char
             | Float Double
             | PrimitiveFunc ([LispVal] -> ThrowError LispVal)
             | Func { params  :: [String],
                      vararg  :: (Maybe String),
                      body    :: [LispVal],
                      closure :: Env}
             | IOFunc ([LispVal] -> IOThrowsError LispVal)
             | Port Handle
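To make the shape of a parsed program concrete, here is a minimal sketch using a trimmed-down copy of LispVal. The reduced constructor set, the `sumExpr` value, and the `countLeaves` helper are all assumptions made for illustration and are not part of Hege. The sketch assumes that the source text `(+ 1 2)` would be represented as a List holding an Atom and two Numbers.

```haskell
-- Trimmed-down LispVal for illustration only: the function, IO and port
-- constructors are left out so the example needs no other definitions.
data LispVal = Atom String
             | List [LispVal]
             | Number Integer
             | String String
             | Bool Bool
             deriving (Show, Eq)

-- Hand-built value assumed to correspond to the source text (+ 1 2).
sumExpr :: LispVal
sumExpr = List [Atom "+", Number 1, Number 2]

-- Hypothetical helper: counts the non-list values inside an expression.
countLeaves :: LispVal -> Int
countLeaves (List xs) = sum (map countLeaves xs)
countLeaves _         = 1
```

Because every construct is a LispVal, later stages can walk a program with ordinary pattern matching, as `countLeaves` does.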

The most basic parser is parseNum. It is written using the Parsec library.

-- | Parses regular number
parseNum :: Parser LispVal         
parseNum  = do
  digits <- many1 digit
  return $ Number $ read digits

parseNum matches one or more digits and returns them as a Number LispVal.
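As a rough usage sketch, parseNum can be run with Parsec's `parse` function. The one-constructor LispVal and the `runNum` helper below are assumptions that keep the example self-contained. They are not Hege's own definitions.

```haskell
import Text.Parsec (parse, many1, digit)
import Text.Parsec.String (Parser)

-- Simplified stand-in for LispVal (assumption: only Number is needed here).
data LispVal = Number Integer deriving (Show, Eq)

-- Same parser as in the text above.
parseNum :: Parser LispVal
parseNum = do
  digits <- many1 digit
  return $ Number $ read digits

-- Hypothetical helper: runs parseNum and drops the error details.
runNum :: String -> Maybe LispVal
runNum input = either (const Nothing) Just (parse parseNum "example" input)
```

Under these assumptions, `runNum "42"` yields `Just (Number 42)` and `runNum "abc"` yields `Nothing`. Since parseNum does not check for the end of input, `runNum "12abc"` stops at the first non-digit and yields `Just (Number 12)`.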

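-- | Parses octal number
parseOct :: Parser LispVal
parseOct = do
  _ <- string "#o"
  octNum <- many1 (oneOf "01234567")
  -- readOct (from the Numeric module) returns a list of (value, rest) pairs
  return $ Number $ fst (readOct octNum !! 0)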

parseOct is similar to the number parser, except that it first expects the prefix #o and then matches only the digits 0-7. It uses readOct to convert the octal digits to an Integer and returns the result as a Number LispVal.
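The conversion step relies on readOct from the Numeric module, which returns a list of (value, remaining input) pairs. The sketch below shows that behaviour in isolation. The `octalValue` helper is a hypothetical name introduced for this example. Unlike the `!! 0` used in parseOct, it returns Nothing rather than crashing on input that is not entirely octal.

```haskell
import Numeric (readOct)

-- Hypothetical helper: converts a string of octal digits to an Integer.
-- readOct yields [(value, rest)]; we accept only a full match (rest == "").
octalValue :: String -> Maybe Integer
octalValue s = case readOct s of
  [(n, "")] -> Just n
  _         -> Nothing
```

For instance, `octalValue "17"` gives `Just 15` and `octalValue "8"` gives `Nothing`. In parseOct the `!! 0` is safe because `many1 (oneOf "01234567")` guarantees at least one valid octal digit before readOct is called.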

Finally, we have the hexadecimal parser parseHex. It looks very similar to parseOct, except that it expects the prefix #h and matches the digits 0-9 and the lowercase letters a-f.

-- | Parses hex number
parseHex :: Parser LispVal
parseHex = do
  _ <- string "#h"
  hexNum <- many1 (digit <|> oneOf "abcdef")
  return $ Number $ fst (readHex hexNum !! 0)

We can then put these parsers together using <|> to make the more general parser parseInt. This allows us to run one parser that matches any of the three number representations. The reason we need try is that both parseHex and parseOct consume the ‘#’ character first. In Parsec, p <|> q only runs q if p fails without consuming any input, and string consumes the characters it has already matched even when it fails later. For example, if a number starts with #h, parseOct matches the ‘#’ but then fails on the ‘h’. Because parseOct is wrapped in try, that failure is treated as if no input had been consumed, so the ‘#’ is still available for parseHex, which expects it. Without try, the ‘#’ would already be consumed, parseHex would never run, and the whole parse would fail.

-- | Parses general number
parseInt :: Parser LispVal
parseInt = try parseOct
   <|> try parseHex
   <|> parseNum
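The effect of try can be checked with a small experiment. The sketch below repeats the three parsers with a one-constructor stand-in for LispVal and compares the combination from the text (`withTry`) with a variant that omits try (`withoutTry`). The names `withTry`, `withoutTry`, and `run` are hypothetical and exist only for this example.

```haskell
import Numeric (readHex, readOct)
import Text.Parsec (parse, try, string, many1, digit, oneOf, (<|>))
import Text.Parsec.String (Parser)

-- Simplified stand-in for LispVal (assumption: only Number is needed here).
data LispVal = Number Integer deriving (Show, Eq)

parseNum, parseOct, parseHex :: Parser LispVal
parseNum = do
  digits <- many1 digit
  return $ Number $ read digits
parseOct = do
  _ <- string "#o"
  octNum <- many1 (oneOf "01234567")
  return $ Number $ fst (readOct octNum !! 0)
parseHex = do
  _ <- string "#h"
  hexNum <- many1 (digit <|> oneOf "abcdef")
  return $ Number $ fst (readHex hexNum !! 0)

-- Same combination as parseInt in the text.
withTry :: Parser LispVal
withTry = try parseOct <|> try parseHex <|> parseNum

-- Variant without try: parseOct consumes '#' before failing on 'h',
-- so <|> never gives parseHex a chance.
withoutTry :: Parser LispVal
withoutTry = parseOct <|> parseHex <|> parseNum

-- Hypothetical helper: runs a parser and drops the error details.
run :: Parser LispVal -> String -> Maybe LispVal
run p input = either (const Nothing) Just (parse p "example" input)
```

With these definitions, `run withTry "#hff"` yields `Just (Number 255)`, while `run withoutTry "#hff"` yields `Nothing`. Both variants still accept `"#o17"` and `"42"`, because in those cases the first alternative either succeeds or fails before consuming anything.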