Codegen refactor #4

clickingbuttons · 2023-06-13T02:10:38Z

Still left:

Summarize architecture and changes here in PR
Packing fields that require offsets (String, Vector, Object, Struct)
Respect namespaces in .fbs files

travisstaloch · 2023-06-13T05:02:38Z

I don't want to discourage you since I know this isn't complete. But this PR seems like more of a different way of doing things than a solution so far.

Serious issues

1. `--no-gen-object-api` still imports object API `...T` types and `Unpack` functions

this (admittedly broken) option has been removed, not fixed.

Property name collisions with unions can lead to generated code calling non-existent functions. ...

i'm not sure if this has been addressed yet.

i would encourage you to try and focus on fixing problems by making minimal changes to the codebase. then the codegen rewrite can be addressed in another PR. as-is this PR mixes a rewrite with fixing api issues. it is too much for me understand and maintain.

clickingbuttons · 2023-06-13T16:24:48Z

This PR will close ALL the issues in #3. Property name collisions have been solved by only offering unpacking the entire union type instead of it's tag and union value separately. Again, this is at the same cost as slices.

i would encourage you to try and focus on fixing problems by making minimal changes to the codebase.

I want to fix more problems than what I've listed. I considered surgically changing the current codebase, but I'd only be able to solve a few of the problems with the current low-context architecture. Variable, field, type, and function naming for instance is one thing that would require substantial changes to be able to handle user-configurable casing and name collision correctly. I think a faster -2.7k/+1.2k with solving other issues is better than a +400/-200 that has to be cleaned up later in order to be readable.

as-is this PR mixes a rewrite with fixing api issues. it is too much for me understand and maintain.

How can I help you understand and maintain it? I plan on documenting it before requesting review. I'd like to avoid making a fork.

travisstaloch · 2023-06-14T01:10:15Z

I want to fix more problems than what I've listed. I considered surgically changing the current codebase, but I'd only be able to solve a few of the problems with the current low-context architecture. Variable, field, type, and function naming for instance is one thing that would require substantial changes to be able to handle user-configurable casing and name collision correctly. I think a faster -2.7k/+1.2k with solving other issues is better than a +400/-200 that has to be cleaned up later in order to be readable.

This is the problem. This PR's scope has changed. I prefer smaller and simpler changes with a well defined scope.

If you can get all tests passing - including adding a recursive example and testing it - I'll be happy to look again and reconsider.

travisstaloch · 2023-06-13T04:09:49Z

src/codegen/writer.zig

+                const getter_name = try self.getFunctionName(name);
+                var setter_buf = std.ArrayList(u8).init(self.allocator);
+                defer setter_buf.deinit();
+                try setter_buf.appendSlice("set");


please keep the existing api with getters and setters named "Field" and "MutateField"

travisstaloch · 2023-06-13T04:13:05Z

src/codegen/writer.zig

+        var res = std.ArrayList(u8).init(self.allocator);
+        defer res.deinit();
+
+        try toCamelCase(res.writer(), name);


getters and setters should be PascalCase (upper camel case)

I think this belongs as a CLI flag. There's some debate on a Zig issue to make function names snake_case.

travisstaloch · 2023-06-14T02:07:32Z

src/codegen/codegen.zig

+    enum_: types.Enum,
+    object: types.Object,
+
+    const Self = @This();


In order to improve readability, please remove all occurrences of const Self = @This() (including generated code). Lets use the container names whenever possible instead of creating another binding. I know that its nice to be able to hard-code Self during generation. But we need to prioritize readability of the generated code over ease of generation.

But we need to prioritize readability of the generated code over ease of generation.

I find Self more readable. It's not a big deal either way, though...

travisstaloch · 2023-06-14T10:18:40Z

src/codegen/string_pool.zig

+const std = @import("std");
+
+const Allocator = std.mem.Allocator;
+pub const StringPool = struct {


why do we need to use string pool? the flatbuffers parser has already checked the field names for us.

To store intermediate strings like names with transformed cases or variable names. It's otherwise difficult to manage their lifetimes.

the reason i asked is because i tried to transition this project toward using more formatting objects like this: https://github.com/travisstaloch/flatbufferz/blob/main/src/codegen.zig#L80 . i found that they helped make some things less imperative / procedural. and while allocations aren't a big concern here, they also helped remove some of them too.

not sure if its possible with your impl, but maybe it can be useful for you.

Yeah I saw that. I made similar helpers for casing.

The string pool is because it's very convenient to store intermediate strings and leads to more readable writer.print("const {s} = struct {{");s. It's also necessary to store them somewhere to handle identifier collisions. A BufferedWriter is another more memory-efficient choice for that, but then you still have to store the views into its buffer somewhere.

clickingbuttons · 2023-06-15T17:40:32Z

I managed to get code generation for unpacking working for non-recursive tables. I even added array handling which seems to be missing.

However, after reading the official implementation I want to make my own fork that:

Treats tables as ?*u8 similar to the C++ char data[1] member. This saves space.
Treats structs as extern structs instead of as tables. This allows bitcasting.
Adds verification similar to the C++ implementation so that invalid offsets in flatbuffers don't cause segfaults.
Adds a slice API for vectors of non-tables.

Indeed, I am scope creeping to writing an entirely different library! I'm thankful for the work you've done and am willing to collaborate if you wish.

travisstaloch · 2023-06-15T18:52:49Z

I even added array handling which seems to be missing.

thanks for pointing this out. could you maybe open an issue with a small fbs repro?

I want to make my own fork that: ...

sounds cool! i really appreciate all the work and feedback you've given. these sound like interesting ideas. i think i'd like to add verification someday too.

wish you luck and will try to follow your progress. i welcome any feedback you have here such as ideas for improvement or missing features / bugs. don't hesitate to open issues. i'll try to do the same. lets collaborate in the future 👍

clickingbuttons · 2023-06-15T21:45:21Z

I've split up #3 into smaller higher quality issues and added #9.

clickingbuttons · 2023-07-07T21:22:40Z

@travisstaloch I'm 95% done with https://github.com/clickingbuttons/flatbuffers-zig. While I'm sure there are some bugs hiding, you're welcome to take ideas from there! It does solve most of my open issues here and offers a safe API that shouldn't be able to segfault.

Thanks for all your work, it was very helpful to know that I had to write a builder, then use flatc --schema to generate .bfbs files, then handwrite reflection readers, and finally create a Zig code generator that uses those readers.

travisstaloch · 2023-07-07T23:39:15Z

Oh cool! Thanks for the update. Nice work! Your generated code looks really good. And like that your implementation isn't mainly a port like mine. Its more from scratch / first principles and I find it very readable. Seems like you had a better understanding of flatbuffers than me when I started.

Glad that you found this project was useful 👍

clickingbuttons added 3 commits June 12, 2023 22:06

start codegen refactor

75e0f46

better codegen naming, struct field accessors, some cleanup

e771149

packing progress

b930675

clickingbuttons added 2 commits June 13, 2023 15:50

better packing, add index file

60cc43c

bugfixes

b959d3f

bugfixes. reading working except for arrays (was already broken, tho)

c29215a

travisstaloch requested changes Jun 15, 2023

View reviewed changes

clickingbuttons closed this Jun 15, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Codegen refactor #4

Codegen refactor #4

clickingbuttons commented Jun 13, 2023

travisstaloch commented Jun 13, 2023 •

edited

Loading

Serious issues

clickingbuttons commented Jun 13, 2023

travisstaloch commented Jun 14, 2023 •

edited

Loading

travisstaloch Jun 13, 2023

travisstaloch Jun 13, 2023

clickingbuttons Jun 15, 2023

travisstaloch Jun 14, 2023

clickingbuttons Jun 15, 2023

travisstaloch Jun 14, 2023

clickingbuttons Jun 15, 2023

travisstaloch Jun 15, 2023

clickingbuttons Jun 15, 2023

clickingbuttons commented Jun 15, 2023

travisstaloch commented Jun 15, 2023

clickingbuttons commented Jun 15, 2023

clickingbuttons commented Jul 7, 2023

travisstaloch commented Jul 7, 2023

Codegen refactor #4

Codegen refactor #4

Conversation

clickingbuttons commented Jun 13, 2023

travisstaloch commented Jun 13, 2023 • edited Loading

Serious issues

clickingbuttons commented Jun 13, 2023

travisstaloch commented Jun 14, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

clickingbuttons commented Jun 15, 2023

travisstaloch commented Jun 15, 2023

clickingbuttons commented Jun 15, 2023

clickingbuttons commented Jul 7, 2023

travisstaloch commented Jul 7, 2023

travisstaloch commented Jun 13, 2023 •

edited

Loading

travisstaloch commented Jun 14, 2023 •

edited

Loading