Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
preprocessor/lexer/parser: Massive refactor of parser pipeline, added import statement (#76)

* preprocessor/lexer/parser: Refactor of pattern language parsing pipeline (#74)
* lexer: update token creation and remove tokens dependency from ast nodes
* lexer: refactor lexer, add nan and inf and general overhaul
* parser: switch to folds over recursion
* parser: remove all unnecessary MATCHES macro expansions
* tests: add doc comments and strings test
* parser/lexer: fixes and stability
* preprocessor/lexer/errors: implement new error system and fragments of new parser system
* misc: remove compiler explorer plugin settings
* build: update libfmt
* tokens/errors/lexer: small refactor and lift of `Location`
* pattern language: make source required to be passed
* preprocessor: calculate location correctly
* preprocessor: fix include resolver not being passed correctly
* error: improved error collection capabilities
* all: new resolver model and partial error implementation
* preprocessor: refactor
* error: support printing compile errors
* pipeline: small fixes
* parser/preprocessor/lexer: small changes to api and fixes
* pattern_language: fixes to support new apis
* preprocessor/lexer: improve lexer and preprocessor and add length to locations
* token: make tokens comparable with literals
* resolver: rework resolvers & parser manager
* parser: simple cleanup & improvements
* build: Improve build times by removing std::chrono from headers
* fix: small parser bug
* fix: small parser fixes
* fix: small mistake
* parser: switch parser over to new error model
* ast: migrate ASTNode over to Location
* clean: remove old unused errors
* validator: refactor validator & mirgrate to new error system
* misc: small refactor, move evaluator over to new runtime errors
* build: update depends
* misc: remove unused code
* fix: remove from cmake lists
* misc: apply suggested changes & rework of source resolving
* doc: document functions
* misc: better name
* fix: add back old `bitfield_order` pragma
* fix: small bug
* impr: add printing of compile errors to tests
* misc: apply review suggestions
* misc: apply review suggestions
* style: adapted style
* lib: Add function to lex a string into tokens
* misc: fix bugs and correct error locations
* parser: Fix various crashes during parsing
* lib: Fix crash when error occurred during preprocessing or lexing
* lib: Fix integer underflow when formatting error messages
* parser: Fixed doc comment parsing
* lib: Make sure resolvers and preprocessor are reset correctly
* parse: Improve handling of infinite loops during error handling
* fix: solve iterator issues by adding safe iterator
* fix: incorrect column numbers
* fix: avoid implicit casts by using diff type
* fix: Compile and linking errors when using clang
* feat: added fuzzing utilities
* feat: add fuzzing dictionary
* fix: update fuzzing instructions
* parser: Use safe pointers
* misc: move safe pointer to own header
* misc: add `IteratorLike` concept to SafeIterator
* misc: update libwolv
* fix: fix alot of crashes
* fix: match not check nullptr
* fix: try-catch not checking for null
* fix: more null-checks
* parser: Fix build errors on clang
* parser: Try to circumvent ICE
* misc: header optimization
* misc: rework error display
* feat: import statement
* misc: Fix merge conflicts
* tests: Fix compile issues
* fix: once import not accounting for different aliases
* fix: Source::Empty() function being constexpr

---------

Co-authored-by: WerWolv <[email protected]>
1 parent 82cce31, commit 188a808
Showing 76 changed files with 4,025 additions and 2,101 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,3 @@ | ||
cmake-build-*/ | ||
build/ | ||
.idea/compilerexplorer.settings.xml |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
.gdb_history | ||
plfuzz | ||
sync/ | ||
output/ | ||
graph/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
cmake_minimum_required(VERSION 3.16) | ||
project(plfuzz) | ||
|
||
set(CMAKE_CXX_STANDARD 23) | ||
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}) | ||
|
||
add_executable(plfuzz source/main.cpp) | ||
|
||
target_link_libraries(plfuzz PRIVATE libpl libpl-gen fmt::fmt-header-only) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
## Pattern Language Fuzzing | ||
Small subproject in the pattern language to allow fuzzing the parser for crashes and other issues. | ||
|
||
### Pre-requisites | ||
To use the fuzzer, you must first have the [AFL++](https://github.com/AFLplusplus/AFLplusplus) fuzzer installed. | ||
Follow the instructions on their repository on how to build the afl-fuzz and afl-cc binaries. | ||
Keep in mind that if you are compiling from source, the afl-cc/afl-c++ binaries must be compiled with at least | ||
clang-17 support. | ||
|
||
### Building | ||
To build the fuzzer, you must set the compiler to afl-cc/afl-c++. To do so, add the following flags to your CMake | ||
invocation: | ||
```bash | ||
-DCMAKE_C_COMPILER=afl-cc -DCMAKE_CXX_COMPILER=afl-c++ -DLIBPL_ENABLE_FUZZING=ON | ||
``` | ||
After this, you can build the fuzzer as normal. | ||
The binary will be in this source folder. | ||
|
||
### Fuzzing | ||
The plfuzz binary takes in a file to parse as an argument. | ||
|
||
To fuzz, follow the AFL++ tutorials on how to fuzz effectively. | ||
There are some simple inputs in the `inputs` folder that you can use to start fuzzing. | ||
There is also a dictionary file in the `dict` folder that you can use to improve the quality of the fuzzing. | ||
|
||
Here is an example of how to start fuzzing: | ||
```bash | ||
afl-fuzz -i inputs -o output -x ./dict/hexpat.dict -- ./plfuzz @@ | ||
``` | ||
This runs a simple fuzzing session that reads its seed inputs from the `inputs` folder and writes its results to the `output` folder. | ||
|
||
### Debugging | ||
During the session, if the fuzzer finds crashes or hangs, it will write the offending input to the | ||
`output/crashes` or `output/hangs` folder. | ||
To debug these cases, simply run the plfuzz binary with the file as an argument: | ||
```bash | ||
./plfuzz output/crashes/<crash_file> | ||
``` | ||
You can also run the binary under GDB to debug the crash, like so: | ||
```bash | ||
gdb --args ./plfuzz output/crashes/<crash_file> | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,81 @@ | ||
# | ||
# AFL dictionary for pattern language hexpat format | ||
# | ||
|
||
keyword_if="if" | ||
keyword_else="else" | ||
keyword_while="while" | ||
keyword_for="for" | ||
keyword_match="match" | ||
keyword_return="return" | ||
keyword_break="break" | ||
keyword_continue="continue" | ||
keyword_struct="struct" | ||
keyword_enum="enum" | ||
keyword_union="union" | ||
keyword_function="fn" | ||
keyword_bitfield="bitfield" | ||
keyword_unsigned="unsigned" | ||
keyword_signed="signed" | ||
keyword_little_endian="le" | ||
keyword_big_endian="be" | ||
keyword_parent="parent" | ||
keyword_namespace="namespace" | ||
keyword_using="using" | ||
keyword_this="this" | ||
keyword_in="in" | ||
keyword_out="out" | ||
keyword_reference="ref" | ||
keyword_null="null" | ||
keyword_const="const" | ||
keyword_try="try" | ||
keyword_catch="catch" | ||
keyword_import="import" | ||
keyword_as="as" | ||
keyword_is="is" | ||
|
||
misc_1=" 1" | ||
misc_a="a" | ||
misc_type="u8" | ||
misc_decl="u8 a=1" | ||
misc_assign=" a=1" | ||
misc_code=" {}" | ||
misc_string="\"a\"" | ||
misc_comment="//" | ||
misc_comment2="/* */" | ||
misc_comment3="/** */" | ||
misc_comment4="/*! */" | ||
misc_comment5="///" | ||
misc_comment6="//!" | ||
misc_comment7="/**" | ||
misc_minus=" -" | ||
misc_plus=" +" | ||
misc_div=" /" | ||
misc_mul=" *" | ||
misc_mod=" %" | ||
misc_and=" &" | ||
misc_or=" |" | ||
misc_xor=" ^" | ||
misc_not=" !" | ||
misc_lshift=" <<" | ||
misc_rshift=" >>" | ||
misc_eq=" =" | ||
misc_neq=" !=" | ||
misc_lt=" <" | ||
misc_gt=" >" | ||
misc_leq=" <=" | ||
misc_geq=" >=" | ||
misc_and2=" &&" | ||
misc_or2=" ||" | ||
misc_unicode_char="'\\u0000'" | ||
misc_hex_char="'\\x00'" | ||
misc_member_access="." | ||
misc_namespace_access="::" | ||
misc_bitfield_size=": 1," | ||
misc_enum_entry="= 1," | ||
misc_function_call="()" | ||
misc_function_paren="( )" | ||
misc_struct_decl="struct a {}" | ||
misc_enum_decl="enum a : u8 {}" | ||
misc_union_decl="union a {}" | ||
misc_bitfield_decl="bitfield a : u8 {}" |
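
The dictionary above uses the AFL token syntax `name="value"`, with `#` comment lines and backslash escapes inside the quotes, so an entry such as `misc_unicode_char` stands for the literal text `'\u0000'`. As a rough illustration of how such entries decode, here is a minimal C++ sketch. `parseDictLine` is a hypothetical helper written for this note; it is not part of plfuzz or AFL++, and it assumes only the escapes `\\`, `\"`, and `\xNN`.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical helper: decodes one dictionary line of the form name="value".
// Returns std::nullopt for blank lines, '#' comments, and malformed entries.
// Simplified sketch, not AFL++'s own loader: only \\, \" and \xNN are handled.
std::optional<std::pair<std::string, std::string>> parseDictLine(std::string_view line) {
    // Skip leading whitespace.
    while (!line.empty() && std::isspace(static_cast<unsigned char>(line.front())))
        line.remove_prefix(1);
    if (line.empty() || line.front() == '#')
        return std::nullopt;

    // The name ends where the opening =" begins; an empty name is rejected.
    const auto eq = line.find("=\"");
    if (eq == std::string_view::npos || eq == 0)
        return std::nullopt;

    std::string name(line.substr(0, eq));
    std::string value;
    for (std::size_t i = eq + 2; i < line.size(); ++i) {
        const char c = line[i];
        if (c == '"')  // closing quote: anything after it is ignored
            return std::make_pair(std::move(name), std::move(value));
        if (c != '\\') {
            value += c;
            continue;
        }
        // Escape sequence: look at the character after the backslash.
        if (++i >= line.size())
            return std::nullopt;
        const char e = line[i];
        if (e == '\\' || e == '"') {
            value += e;
        } else if (e == 'x' && i + 2 < line.size()
                   && std::isxdigit(static_cast<unsigned char>(line[i + 1]))
                   && std::isxdigit(static_cast<unsigned char>(line[i + 2]))) {
            // Two hex digits encode one byte.
            value += static_cast<char>(std::stoi(std::string(line.substr(i + 1, 2)), nullptr, 16));
            i += 2;
        } else {
            return std::nullopt;
        }
    }
    return std::nullopt;  // no closing quote found
}
```

Running every line of `dict/hexpat.dict` through a check like this before a fuzzing session is one way to catch an unbalanced quote early.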
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
u8 a; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
struct A { | ||
u8 a; | ||
u8 b; | ||
u8 c; | ||
}; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
u8 a = 43 [[export]]; | ||
u8 c = 11 [[hidden]]; | ||
u8 d @ 0x00 [[export]]; | ||
|
||
u8 a[while(0)] @ 0x00 [[export]]; | ||
|
||
struct A { | ||
u8 a [[export]]; | ||
u8 c [[hidden]]; | ||
} [[format("A")]]; | ||
|
||
enum B : u8 { | ||
A = 43; | ||
B = 432, | ||
C = 62 | ||
}; | ||
|
||
bitfield C { | ||
u8 a : 23 [[export]]; | ||
u8 c : 252 [[hidden]]; | ||
} [[format("A")]]; | ||
|
||
fn test() { | ||
a = 3; | ||
c = 4; | ||
}; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
#define A | ||
#define B | ||
#ifdef A | ||
#define C | ||
u8 a; | ||
#endif |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
struct A {}; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
#include <pl/pattern_language.hpp> | ||
#include <wolv/io/file.hpp> | ||
|
||
#include <iostream> | ||
#include <pl/helpers/utils.hpp> | ||
#include <wolv/utils/string.hpp> | ||
|
||
int main(int argc, char **argv) { | ||
if(argc < 2) { | ||
std::cout << "Invalid number of arguments specified! " << argc << std::endl; | ||
return EXIT_FAILURE; | ||
} | ||
|
||
std::fs::path path; | ||
|
||
if(strncmp(argv[1], "-t", 2) == 0) { | ||
if(argc != 3 ) { | ||
std::cout << "Invalid number of arguments specified! " << argc << std::endl; | ||
return EXIT_FAILURE; | ||
} | ||
std::string base = argv[2]; // base path | ||
std::cout << "Number: " << std::endl; | ||
int number = 0; | ||
std::cin >> number; | ||
std::vector<std::fs::path> paths; | ||
for(auto &file : std::filesystem::directory_iterator(base)) { | ||
paths.push_back(file.path()); | ||
} | ||
// sort by name | ||
std::ranges::sort(paths, [](const std::fs::path &a, const std::fs::path &b) { | ||
return a.filename().string() < b.filename().string(); | ||
}); | ||
path = paths.at(number); // bounds-checked: throws std::out_of_range on a bad index | ||
std::cout << "Executing: " << path << std::endl; | ||
} else { | ||
path = argv[1]; | ||
} | ||
|
||
wolv::io::File patternFile(path, wolv::io::File::Mode::Read); | ||
|
||
pl::PatternLanguage runtime; | ||
|
||
auto result = | ||
runtime.parseString(patternFile.readString(), wolv::util::toUTF8String(path)); | ||
|
||
return EXIT_SUCCESS; | ||
} |