Various (very good) improvements #496

Merged 15 commits on Sep 28, 2024
9 changes: 9 additions & 0 deletions CHANGELOG.md
Expand Up @@ -25,6 +25,9 @@
- added resolving `empty?` as a macro when possible
- added short-circuiting to `and` and `or` implementation
- added `--check` to the formatter as an option: returns 0 if the code is correctly formatted, 1 otherwise
- the name & scope resolution pass now checks for mutability errors
- compile time checks for mutability errors with `append!`, `concat!` and `pop!`
- new `MAKE_CLOSURE <page addr>` instruction, generated in place of a `LOAD_CONST` when a closure is made

### Changed
- instructions are on 4 bytes: 1 byte for the instruction, 1 byte of padding, 2 bytes for an immediate argument
Expand Down Expand Up @@ -74,6 +77,10 @@
- repl completion and colors are now generated automatically from the builtins, keywords & operators
- fixed formatting of comments inside function declarations
- renamed the macros `symcat` and `argcount` to `$symcat` and `$argcount` for uniformity
- the `Ark::VM` class is now `final`
- the `STORE` instruction has been renamed `SET_VAL`
- the `STORE` instruction is emitted in place of the `LET` and `MUT` instructions, without any mutability checking now
- `io:writeFile` no longer takes a mode and has been split into `io:writeFile` and `io:appendToFile`

### Removed
- removed unused `NodeType::Closure`
Expand All @@ -85,6 +92,8 @@
- removed useless `\0` escape in strings
- removed `termcolor` dependency to rely on `fmt` for coloring outputs
- removed `and` and `or` instructions in favor of a better implementation to support short-circuiting
- removed `LET` and `MUT` instructions in favor of a single new `STORE` instruction
- removed `SAVE_ENV` instruction

## [3.5.0] - 2023-02-19
### Added
Expand Down
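The changelog entry about 4-byte instructions (1 byte for the instruction, 1 byte of padding, 2 bytes for an immediate argument) can be illustrated with a minimal sketch. The names `Word`, `encode`, `opcodeOf` and `argOf` are hypothetical, and the big-endian order of the immediate is an assumption; the changelog does not state the byte order.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 4-byte instruction word, following the changelog entry:
// byte 0 = instruction, byte 1 = padding, bytes 2-3 = 16-bit immediate.
// ASSUMPTION: the immediate is stored big-endian (high byte first).
using Word = std::array<std::uint8_t, 4>;

inline Word encode(std::uint8_t opcode, std::uint16_t arg)
{
    Word w {};
    w[0] = opcode;
    w[1] = 0;  // padding byte, unused
    w[2] = static_cast<std::uint8_t>(arg >> 8);
    w[3] = static_cast<std::uint8_t>(arg & 0xFF);
    return w;
}

inline std::uint8_t opcodeOf(const Word& w)
{
    return w[0];
}

inline std::uint16_t argOf(const Word& w)
{
    // Rebuild the 16-bit immediate from its two bytes
    return static_cast<std::uint16_t>((w[2] << 8) | w[3]);
}
```

With a fixed-width word like this, a decoder can step through bytecode four bytes at a time without inspecting the opcode first, which is one plausible motivation for the padding byte.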
21 changes: 11 additions & 10 deletions include/Ark/Builtins/Builtins.hpp
Expand Up @@ -45,16 +45,17 @@ namespace Ark::internal::Builtins

namespace IO
{
Value print(std::vector<Value>& n, VM* vm); // print, multiple arguments
Value puts_(std::vector<Value>& n, VM* vm); // puts, multiple arguments
Value input(std::vector<Value>& n, VM* vm); // input, 0 or 1 argument
Value writeFile(std::vector<Value>& n, VM* vm); // io:writeFile, 2 or 3 arguments
Value readFile(std::vector<Value>& n, VM* vm); // io:readFile, 1 argument
Value fileExists(std::vector<Value>& n, VM* vm); // io:fileExists?, 1 argument
Value listFiles(std::vector<Value>& n, VM* vm); // io:listFiles, 1 argument
Value isDirectory(std::vector<Value>& n, VM* vm); // io:isDir?, 1 argument
Value makeDir(std::vector<Value>& n, VM* vm); // io:makeDir, 1 argument
Value removeFiles(std::vector<Value>& n, VM* vm); // io:removeFiles, multiple arguments
Value print(std::vector<Value>& n, VM* vm); // print, multiple arguments
Value puts_(std::vector<Value>& n, VM* vm); // puts, multiple arguments
Value input(std::vector<Value>& n, VM* vm); // input, 0 or 1 argument
Value writeFile(std::vector<Value>& n, VM* vm); // io:writeFile, 2 arguments
Value appendToFile(std::vector<Value>& n, VM* vm); // io:appendToFile, 2 arguments
Value readFile(std::vector<Value>& n, VM* vm); // io:readFile, 1 argument
Value fileExists(std::vector<Value>& n, VM* vm); // io:fileExists?, 1 argument
Value listFiles(std::vector<Value>& n, VM* vm); // io:listFiles, 1 argument
Value isDirectory(std::vector<Value>& n, VM* vm); // io:isDir?, 1 argument
Value makeDir(std::vector<Value>& n, VM* vm); // io:makeDir, 1 argument
Value removeFiles(std::vector<Value>& n, VM* vm); // io:removeFiles, multiple arguments
}

namespace Time
Expand Down
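The split of `io:writeFile` into `io:writeFile` and `io:appendToFile` replaces a mode argument with two functions that each take 2 arguments. A rough sketch of the distinction in plain C++ follows. The free functions below are hypothetical stand-ins, not the builtins themselves (which take a `std::vector<Value>&` and a `VM*`), and the use of `std::ofstream` is an assumption about how such a split could be implemented.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical stand-in for io:writeFile: replaces the file content.
inline bool writeFile(const std::string& path, const std::string& content)
{
    std::ofstream out(path, std::ios::out | std::ios::trunc);
    if (!out)
        return false;
    out << content;
    return out.good();
}

// Hypothetical stand-in for io:appendToFile: keeps the existing content
// and writes at the end of the file.
inline bool appendToFile(const std::string& path, const std::string& content)
{
    std::ofstream out(path, std::ios::out | std::ios::app);
    if (!out)
        return false;
    out << content;
    return out.good();
}

// Helper used only to observe the result; returns an empty string
// if the file cannot be opened.
inline std::string readWholeFile(const std::string& path)
{
    std::ifstream in(path);
    return std::string(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
}
```

Two functions with fixed arity make the intent visible at the call site and remove the need to validate a mode string at runtime.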
98 changes: 78 additions & 20 deletions include/Ark/Compiler/AST/BaseParser.hpp
Expand Up @@ -11,6 +11,9 @@

namespace Ark::internal
{
/**
* @brief Describe a position in a given file ; handled by the BaseParser
*/
struct FilePosition
{
std::size_t row = 0;
Expand All @@ -24,16 +27,22 @@ namespace Ark::internal

private:
std::string m_str;
std::vector<std::pair<std::string::iterator, std::size_t>> m_it_to_row;
std::vector<std::pair<std::string::iterator, std::size_t>> m_it_to_row; ///< A crude map of \n position to line number to speed up line number computing
std::string::iterator m_it, m_next_it;
utf8_char_t m_sym;
FilePosition m_filepos;

utf8_char_t m_sym; ///< The current utf8 character we're on
FilePosition m_filepos; ///< The position of the cursor in the file

/**
* @brief Register the position of a new line, with an iterator pointing to the new line and the row number
*
* @param it
* @param row
*/
void registerNewLine(std::string::iterator it, std::size_t row);

/*
getting next character and changing the values of count/row/col/sym
*/
/**
* @brief getting next character and changing the values of count/row/col/sym
*/
void next();

protected:
Expand All @@ -43,35 +52,69 @@ namespace Ark::internal

FilePosition getCursor() const;

/**
*
* @param error an error message
* @param exp the expression causing the error
*/
void error(const std::string& error, std::string exp);

/**
* @brief Fetch the next token (space and paren delimited) to generate an error
*
* @param message an error message
*/
void errorWithNextToken(const std::string& message);

/**
* @brief Generate an error for a given node when a suffix is missing
*
* @param suffix a suffix char, eg " or )
* @param node_name can be "string", "node" ; represents a structure
*/
void errorMissingSuffix(char suffix, const std::string& node_name);

/**
*
* @return distance in characters from the beginning of the file to the cursor
*/
long getCount() { return std::distance(m_str.begin(), m_it); }

/**
*
* @return file size in bytes
*/
std::size_t getSize() const { return m_str.size(); }
bool isEOF() { return m_it == m_str.end(); }

/**
*
* @return true if the cursor is positioned at the end of the file
*/
[[nodiscard]] bool isEOF() const { return m_it == m_str.end(); }

void backtrack(long n);

/*
Function to use and check if a Character Predicate was able to parse
the current symbol.
Add the symbol to the given string (if there was one) and call next()
*/
/**
* @brief check if a Character Predicate was able to parse, call next() if matching
*
* @param t a char predicate to match
* @param s optional string to append the matching chars to
* @return true if matched
*/
bool accept(const CharPred& t, std::string* s = nullptr);

/*
Function to use and check if a Character Predicate was able to parse
the current Symbol.
Add the symbol to the given string (if there was one) and call next().
Throw a CodeError if it couldn't.
*/
/**
* @brief check if a Character Predicate was able to parse, call next() if matching ; throw a CodeError if it doesn't match
* @param t a char predicate to match
* @param s optional string to append the matching chars to
* @return true if matched
*/
bool expect(const CharPred& t, std::string* s = nullptr);

// basic parsers

bool space(std::string* s = nullptr);
bool inlineSpace(std::string* s = nullptr);
bool endOfLine(std::string* s = nullptr);
bool comment(std::string* s = nullptr);
bool spaceComment(std::string* s = nullptr);
bool newlineOrComment(std::string* s = nullptr);
Expand All @@ -83,8 +126,23 @@ namespace Ark::internal
bool name(std::string* s = nullptr);
bool sequence(const std::string& s);
bool packageName(std::string* s = nullptr);

/**
* @brief Match any char that does not match the predicate
*
* @param delim delimiter predicate
* @param s optional string to append the matching chars to
* @return true if matched
*/
bool anyUntil(const CharPred& delim, std::string* s = nullptr);

/**
* @brief Fetch a token and try to match one of the given words
*
* @param words list of words to match against
* @param s optional string to append the matching chars to
* @return true if matched
*/
bool oneOf(std::initializer_list<std::string> words, std::string* s = nullptr);
};
}
Expand Down
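The documented contract of `accept`, `expect` and `anyUntil` can be sketched on a much smaller parser. Everything below is an illustrative assumption: `MiniParser` is hypothetical, it works on plain `char` instead of `utf8_char_t`, `CharPred` is modelled as a `std::function`, and it throws `std::runtime_error` where the real parser throws a `CodeError`.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// ASSUMPTION: a char predicate is modelled as a plain callable on char.
using CharPred = std::function<bool(char)>;

// Hypothetical, simplified parser illustrating the documented contract.
class MiniParser
{
public:
    explicit MiniParser(std::string code) :
        m_str(std::move(code)), m_pos(0) {}

    bool isEOF() const { return m_pos == m_str.size(); }

    // If the predicate matches the current char: append it to s (if given),
    // advance the cursor and return true. Otherwise leave the cursor alone.
    bool accept(const CharPred& t, std::string* s = nullptr)
    {
        if (isEOF() || !t(m_str[m_pos]))
            return false;
        if (s != nullptr)
            s->push_back(m_str[m_pos]);
        ++m_pos;
        return true;
    }

    // Same as accept, but a mismatch is an error.
    bool expect(const CharPred& t, std::string* s = nullptr)
    {
        if (!accept(t, s))
            throw std::runtime_error("unexpected character at offset " + std::to_string(m_pos));
        return true;
    }

    // Consume every char that does NOT match the delimiter predicate.
    bool anyUntil(const CharPred& delim, std::string* s = nullptr)
    {
        bool matched = false;
        while (accept([&delim](char c) { return !delim(c); }, s))
            matched = true;
        return matched;
    }

private:
    std::string m_str;
    std::size_t m_pos;
};
```

For example, on the input `abc)x`, calling `anyUntil` with a predicate matching `)` collects `abc` and stops with the cursor on the closing parenthesis, which `expect` can then consume.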
8 changes: 8 additions & 0 deletions include/Ark/Compiler/AST/Import.hpp
Expand Up @@ -42,13 +42,21 @@ namespace Ark::internal
*/
std::vector<std::string> symbols;

/**
*
* @return a package string, eg a.b.c
*/
[[nodiscard]] std::string toPackageString() const
{
return std::accumulate(package.begin() + 1, package.end(), package.front(), [](const std::string& left, const std::string& right) {
return left + "." + right;
});
}

/**
*
* @return a package as a path, eg a/b/c
*/
[[nodiscard]] std::string packageToPath() const
{
return std::accumulate(
Expand Down
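The `std::accumulate` fold used by `toPackageString` can be tried in isolation. The helper `joinPackage` below is hypothetical: it takes the separator as a parameter so the same fold yields `a.b.c` or `a/b/c`, and it adds an empty-vector guard that the excerpt above does not show. Whether `packageToPath` is written this way is an assumption, since its body is truncated in the diff.

```cpp
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: join package components with a separator,
// seeding the fold with the first component as toPackageString does.
inline std::string joinPackage(const std::vector<std::string>& package, const std::string& sep)
{
    // Guard added for this sketch: begin() + 1 and front() are only
    // valid on a non-empty vector.
    if (package.empty())
        return {};

    return std::accumulate(
        package.begin() + 1, package.end(), package.front(),
        [&sep](const std::string& left, const std::string& right) {
            return left + sep + right;
        });
}
```

Seeding the fold with `package.front()` avoids a leading separator, at the cost of requiring at least one component.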
21 changes: 16 additions & 5 deletions include/Ark/Compiler/AST/Node.hpp
Expand Up @@ -2,10 +2,10 @@
* @file Node.hpp
* @author Alexandre Plateau ([email protected])
* @brief AST node used by the parser, optimizer and compiler
* @version 0.4
* @version 1.0
* @date 2020-10-27
*
* @copyright Copyright (c) 2020-2021
* @copyright Copyright (c) 2020-2024
*
*/

Expand Down Expand Up @@ -38,9 +38,7 @@ namespace Ark::internal
explicit Node(NodeType node_type);
explicit Node(double value);
explicit Node(long value);
explicit Node(int value);
explicit Node(Keyword value);
explicit Node(const std::vector<Node>& nodes);

/**
* @brief Return the string held by the value (if the node type allows it)
Expand Down Expand Up @@ -134,6 +132,11 @@ namespace Ark::internal
*/
Node& attachNearestCommentBefore(const std::string& comment);

/**
* @brief Set the comment_after field with the nearest comment after this node
* @param comment
* @return Node& reference to this node after updating it
*/
Node& attachCommentAfter(const std::string& comment);

/**
Expand Down Expand Up @@ -163,6 +166,10 @@ namespace Ark::internal
*/
[[nodiscard]] const std::string& comment() const noexcept;

/**
* @brief Return the comment attached after this node, if any
* @return const std::string&
*/
[[nodiscard]] const std::string& commentAfter() const noexcept;

/**
Expand All @@ -180,7 +187,6 @@ namespace Ark::internal

friend bool operator==(const Node& A, const Node& B);
friend bool operator<(const Node& A, const Node& B);
friend bool operator!(const Node& A);

private:
NodeType m_type;
Expand All @@ -197,6 +203,11 @@ namespace Ark::internal
const Node& getNilNode();
const Node& getListNode();

/**
*
* @param node
* @return std::string a string corresponding to the node type
*/
inline std::string typeToString(const Node& node) noexcept
{
if (node.nodeType() == NodeType::Symbol)
Expand Down
46 changes: 43 additions & 3 deletions include/Ark/Compiler/AST/Parser.hpp
Expand Up @@ -37,20 +37,38 @@ namespace Ark::internal
*/
explicit Parser(unsigned debug, bool interpret = true);

/**
* @brief Parse the given code
* @param filename can be left empty, used for error generation
* @param code content of the file
*/
void process(const std::string& filename, const std::string& code);

/**
*
* @return const Node& resulting AST after processing the given code
*/
[[nodiscard]] const Node& ast() const noexcept;

/**
*
* @return const std::vector<Import>& list of imports detected by the parser
*/
[[nodiscard]] const std::vector<Import>& imports() const;

private:
bool m_interpret;
bool m_interpret; ///< interpret escape codes in strings
Logger m_logger;
Node m_ast;
std::vector<Import> m_imports;
unsigned m_allow_macro_behavior; ///< Toggled on when inside a macro definition, off afterward

void run();

/**
* @brief Update a node given a file position
* @param node node to update
* @param cursor the node position in file
* @return Node& the modified node
*/
Node& setNodePosAndFilename(Node& node, const std::optional<FilePosition>& cursor = std::nullopt) const;

std::optional<Node> node();
Expand Down Expand Up @@ -236,9 +254,31 @@ namespace Ark::internal
return { Node(NodeType::List).attachNearestCommentBefore(comment) };
}

/**
* @brief Try to parse an atom (number, string, spread, field, symbol, nil)
* @return std::optional<Node> std::nullopt if no atom could be parsed
*/
std::optional<Node> atom();

/**
* @brief Try to parse an atom, if any, match its type against the given list
* @param types authorized types
* @return std::optional<Node> std::nullopt if the parsed atom didn't match the given types
*/
std::optional<Node> anyAtomOf(std::initializer_list<NodeType> types);

/**
* @brief Try to parse an atom first, if it fails try to parse a node
* @return std::optional<Node> std::nullopt if no atom or node could be parsed
*/
std::optional<Node> nodeOrValue();

/**
* @brief Try to parse using a given parser, prefixing and suffixing it with (...), handling comments around the parsed node
* @param parser parser method returning a std::optional<Node>
* @param name construction name, eg "let", "condition"
* @return std::optional<Node> std::nullopt if the parser didn't match
*/
std::optional<Node> wrapped(std::optional<Node> (Parser::*parser)(), const std::string& name);
};
}
Expand Down
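`wrapped` takes a pointer to a `Parser` member function and surrounds it with `(...)`. The sketch below shows that pattern on a hypothetical `TinyParser` that returns strings instead of `Node`s. It ignores comments entirely, and its failure behaviour (restoring the cursor when the inner parser fails, throwing when the closing parenthesis is missing) is an assumption, not something the header documents.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical parser used only to illustrate the pointer-to-member pattern.
class TinyParser
{
public:
    explicit TinyParser(std::string code) :
        m_str(std::move(code)), m_pos(0) {}

    // Inner parser: one or more lowercase ASCII letters.
    std::optional<std::string> word()
    {
        std::string out;
        while (m_pos < m_str.size() && std::islower(static_cast<unsigned char>(m_str[m_pos])))
            out.push_back(m_str[m_pos++]);
        if (out.empty())
            return std::nullopt;
        return out;
    }

    // Run the given member parser between '(' and ')'.
    std::optional<std::string> wrapped(std::optional<std::string> (TinyParser::*parser)(), const std::string& name)
    {
        const std::size_t start = m_pos;
        if (!acceptChar('('))
            return std::nullopt;

        std::optional<std::string> result = (this->*parser)();
        if (!result)
        {
            m_pos = start;  // ASSUMPTION: restore the cursor on failure
            return std::nullopt;
        }

        if (!acceptChar(')'))
            throw std::runtime_error("Missing ')' after " + name);
        return result;
    }

private:
    bool acceptChar(char c)
    {
        if (m_pos < m_str.size() && m_str[m_pos] == c)
        {
            ++m_pos;
            return true;
        }
        return false;
    }

    std::string m_str;
    std::size_t m_pos;
};
```

Calling `wrapped(&TinyParser::word, "word")` on `(abc)` yields `abc`, while the same call on `abc` yields `std::nullopt` because the opening parenthesis is missing.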
16 changes: 13 additions & 3 deletions include/Ark/Compiler/AST/utf8_char.hpp
Expand Up @@ -19,11 +19,17 @@ namespace Ark::internal
utf8_char_t() :
m_codepoint(0), m_length(0), m_repr({ 0 }) {}

utf8_char_t(codepoint_t cp, length_t len, repr_t repr) :
utf8_char_t(const codepoint_t cp, const length_t len, const repr_t repr) :
m_codepoint(cp), m_length(len), m_repr(repr) {}

// https://github.com/sheredom/utf8.h/blob/4e4d828174c35e4564c31a9e35580c299c69a063/utf8.h#L1178
static std::pair<std::string::iterator, utf8_char_t> at(std::string::iterator it, const std::string::iterator end)
/**
* @brief Parse a codepoint and compute its length and representation
* @details https://github.com/sheredom/utf8.h/blob/4e4d828174c35e4564c31a9e35580c299c69a063/utf8.h#L1178
* @param it iterator in a string
* @param end end iterator, used to avoid going out of bound
* @return std::pair<std::string::iterator, utf8_char_t> the iterator points to the beginning of the next codepoint, the utf8_char_t represents the parsed codepoint
*/
static std::pair<std::string::iterator, utf8_char_t> at(const std::string::iterator it, const std::string::iterator end)
{
codepoint_t cp;
length_t length;
Expand Down Expand Up @@ -72,6 +78,10 @@ namespace Ark::internal
utf8_char_t(cp, length, repr));
}

/**
*
* @return true if the given codepoint is printable according to std::isprint
*/
[[nodiscard]] bool isPrintable() const
{
if (m_codepoint < std::numeric_limits<char>::max())
Expand Down
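The documented return value of `utf8_char_t::at` (an iterator to the beginning of the next codepoint, plus the parsed codepoint) can be illustrated with a simplified decoder. This is a sketch under assumptions, not the implementation from the linked utf8.h: `Decoded` and `decodeAt` are hypothetical names, continuation bytes are not validated, an invalid lead byte is treated as a 1-byte character, and a truncated sequence yields a zero-length result.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical result type: the codepoint and its encoded length in bytes.
struct Decoded
{
    std::uint32_t codepoint;
    int length;
};

// Decode one UTF-8 sequence starting at `it`, never reading past `end`.
inline std::pair<std::string::const_iterator, Decoded> decodeAt(
    std::string::const_iterator it, const std::string::const_iterator end)
{
    if (it == end)
        return std::make_pair(it, Decoded { 0, 0 });

    const auto b0 = static_cast<unsigned char>(*it);
    int length = 1;
    std::uint32_t cp = b0;

    // The high bits of the lead byte give the sequence length
    if ((b0 & 0xF8) == 0xF0) { length = 4; cp = b0 & 0x07; }
    else if ((b0 & 0xF0) == 0xE0) { length = 3; cp = b0 & 0x0F; }
    else if ((b0 & 0xE0) == 0xC0) { length = 2; cp = b0 & 0x1F; }

    auto next = it + 1;
    for (int i = 1; i < length; ++i, ++next)
    {
        if (next == end)  // truncated sequence: report nothing parsed
            return std::make_pair(end, Decoded { 0, 0 });
        // Each continuation byte contributes its low 6 bits
        cp = (cp << 6) | (static_cast<unsigned char>(*next) & 0x3Fu);
    }
    return std::make_pair(next, Decoded { cp, length });
}
```

Returning the next iterator alongside the decoded value lets a caller walk a string codepoint by codepoint without recomputing lengths.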