Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introducing the C API #279

Merged
merged 3 commits into from
Nov 26, 2023
Merged

Introducing the C API #279

merged 3 commits into from
Nov 26, 2023

Conversation

Kiyoshika
Copy link
Member

@Kiyoshika Kiyoshika commented Nov 25, 2023

This introduces the first set of C bindings for gosub which, at the moment, just exposes the render tree that we can use in Swift to render the text on the screen.

Ideally, we will build out an "engine API" and wrap that so the user agents can use, instead of using gosub-internal data structures, but right now this builds the foundation for the C API.

EDIT: unsafe code has been completely moved out of the core engine and instead a wrapper has been introduced in the bindings to convert between engine structs and C-friendly structs.
I'm trying to keep as much unsafe code out of the engine as possible, but in some cases this is unavoidable to keep the structs compatible with the C interface. We may be able to move this stuff out of the engine at some point.

Currently the only known problem with the C API is that we are leaking memory from the DocumentHandle held by the RenderTree. I'm currently investigating this but unfortunately can't find the root cause after trying various different ways to "free" this memory. We may have to use something other than a DocumentHandle to get around it... not sure yet

After this is merged, I'd like to write some docs on the gosub wiki / github wiki about how to use the C API "safely" which will also be useful to others building native clients (the main consumers of the API)

src/render_tree.rs Outdated Show resolved Hide resolved
@@ -152,7 +172,7 @@ pub struct Node {
impl Node {
pub fn new() -> Self {
Self {
node_type: NodeType::Root,
node_type: NodeType::Root(true),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a dummy value that needs to be included otherwise it seems like it's ignored going from Rust -> C

src/render_tree.rs Outdated Show resolved Hide resolved
@@ -330,107 +349,4 @@ mod tests {
panic!()
}
}

#[test]
fn text_nodes() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is moved to gosub-bindings/tests

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if you want to have this tests only over the C API

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I was on the fence about this... my thought was the main purpose of this portion of the test was more relevant for the C API, and it might be good to have both tests but then I'd have to maintain two versions of the same test, and I'd rather have the C API be right (which would also imply the render tree implementation is correct.) But I can see the argument for having it in the render tree itself as well

src/render_tree/text.rs Outdated Show resolved Hide resolved
src/render_tree/text.rs Outdated Show resolved Hide resolved
@Kiyoshika Kiyoshika requested a review from a team November 25, 2023 01:39
Copy link
Member

@Sharktheone Sharktheone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like the usage of mut ptrs in the engine itself. I think, there must be a other alternative to deal with this. Maybe you can add other struct variants for example CTextNode and convert it to this. But I'm not sure, what the performance would say about this.

gosub-bindings/src/lib.rs Outdated Show resolved Hide resolved
///
/// Moves an owning pointer to the render_tree using Box::into_raw() to the C API.
/// This pointer MUST be passed to gosub_render_tree_free() after usage for proper cleanup.
pub unsafe extern "C" fn gosub_render_tree_init(html: *const c_char) -> *mut RenderTree {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, this is okay, but i guess we want this most likely to be a URL and not the HTML document, so we can use the browser's dns system

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would want to move to the DNS system eventually, but was not available at the time I wrote this

Copy link
Collaborator

@emwalker emwalker Nov 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we guaranteed that the Swift / Objective C side will never free this *const c_char pointer while we're in Rust code?

chars.set_confidence(Confidence::Certain);

let doc = DocumentBuilder::new_document();
let parse_result = Html5Parser::parse_document(&mut chars, Document::clone(&doc), None);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed here, we maybe should not return a Result with a Vec<ParseError> as the Ok type. Maybe i missed something here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, @jaytaph would have more insight to that, I am just a consumer of the parser :P

/// This takes ownership of the pointer from the C API and transfers it to Rust so it can
/// be deallocated.
pub unsafe extern "C" fn gosub_render_tree_iterator_free(tree_iterator: *mut TreeIterator) {
let _ = Box::from_raw(tree_iterator);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested using drop_in_place here? At leased you didn't wrote something in Zulip.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did but it segfaulted. There's something odd about the document, I noted it down for myself for future research but could not figure it out yet.

I might move away from DocumentHandle and try Box-ing the Document itself to see if that would work

src/render_tree.rs Outdated Show resolved Hide resolved
src/render_tree/text.rs Outdated Show resolved Hide resolved
@Kiyoshika Kiyoshika marked this pull request as draft November 25, 2023 15:00
@Kiyoshika
Copy link
Member Author

Kiyoshika commented Nov 25, 2023

Converting to draft for the following changes:

  • Introduce different struct to avoid raw pointers and unsafe code in the engine (and use String again). Convert to this new struct in the bindings that will convert the String to a raw pointer and transfer ownership to the C API
  • Change f32 back to f64 and update C API to use double instead of float
  • move the #[no_mangle] tags down to the actual function instead of above the comments

TODO in the future (not in this PR, can create issues/tasks for these)

  • switch to @jaytaph's DNS system
  • investigate the memory leak caused by DocumentHandle, consider moving to different pointer type and using Document directly

@Kiyoshika Kiyoshika marked this pull request as ready for review November 25, 2023 18:13
@Kiyoshika Kiyoshika requested a review from a team November 25, 2023 18:20
Copy link
Member

@Sharktheone Sharktheone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor things, but at most of them feel free to ignore.
Note that i cannot say too much about the C side.

gosub-bindings/src/gosub_api.c Outdated Show resolved Hide resolved
gosub-bindings/src/lib.rs Outdated Show resolved Hide resolved
gosub-bindings/src/lib.rs Outdated Show resolved Hide resolved
src/render_tree.rs Outdated Show resolved Hide resolved
@@ -330,107 +349,4 @@ mod tests {
panic!()
}
}

#[test]
fn text_nodes() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if you want to have this tests only over the C API

@Kiyoshika
Copy link
Member Author

@Sharktheone added your suggestions and also switched the render_tree_init to use a stack-allocated struct instead of allocating one on the heap

Copy link
Member

@Sharktheone Sharktheone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a minor thing, feel free to ignore.

Comment on lines 21 to 23
/// Initialize a render tree by passing a stack-allocated
/// struct by address.
void render_tree_init(struct render_tree_t *render_tree, const char *html);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know too much about this, but shouldn't it be possible to create the render_tree_t in the function? I've seen this approach in many different C Libraries, so it's probably the best solution, when we want to have it on the stack.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't return a pointer to something stack-allocated in a function, C doesn't transfer ownership. If you tried dereferencing it, you would segfault

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless they are doing something that I've never heard of before... I feel like that may be possible in C++ with its move semantics, but I haven't really seen anything like that in C

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you mean returning the struct by value, I guess I didn't really think of that, but not something I do often since it seems kind of weird to copy the value around (even tho the compiler might do some optimizations on return values)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, after some thought, just returning the struct by value might be fine in this case, it's very small anyways

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, okay. I don't have much C experience. I mostly used C++ and not raw C.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at my personal github, I came from mostly raw C... but usually the structs I work with are relatively large so returning by value didn't come across my mind since I am so used to returning pointers

Copy link
Member

@Sharktheone Sharktheone Nov 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, after some thought, just returning the struct by value might be fine in this case, it's very small anyways

I guess, you can just leave it the way it currently is, but if you want to change it just do it. I don't want to make you too much work. 🙂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I will leave it as a parameter and return the int to check if a failure occurred. Then I think this will be ready..

gosub-bindings/src/gosub_api.c Outdated Show resolved Hide resolved
Copy link
Member

@Sharktheone Sharktheone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@emwalker emwalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fully out of my depth. I have only dabbled in C over the years, so I have no way to approve this PR beyond the impression that you and @Sharktheone seem to know C a lot more than I do. Godspeed.

./$(TEST_SRC_DIR)/render_tree_test

format:
cargo clippy --fix --allow-dirty --allow-staged
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am having trouble building the bindings and test targets on my Ubuntu laptop. Is there some setup that I need to do to get started?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should only have to cd gosub-bindings, cargo build then run the make files

Copy link
Collaborator

@emwalker emwalker Nov 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. Yes, that works.

2023-11-25_20-19

I see that that leaves a target directory under gosub-bindings. It seems that we're now getting into the realm of a Cargo workspace, which is what I'm thinking we should switch to at some point soon if you want to keep this crate separate from the main one in this repo. (Workspaces allow you to publish a lot of crates under a single repo.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also thought about this a while ago, but we only had one crate at that time.

Copy link
Member Author

@Kiyoshika Kiyoshika Nov 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I've been thinking of if it would be beneficial to move this out of the engine (which currently lists it as a dependency if you look at the .toml) but I wasn't sure what to do.

From the user agent perspective, it would only need to build the bindings (which would build the engine anyways since it's a dependency)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be as easy as this directory structure:

  • gosub-engine/
    • engine/
      • Cargo.toml
    • bindings/
      • Cargo.toml
    • Cargo.toml (for all the crates)
$ cat Cargo.toml
[workspace]

members = [
    "engine",
    "bindings",
]

///
/// Moves an owning pointer to the render_tree using Box::into_raw() to the C API.
/// This pointer MUST be passed to gosub_render_tree_free() after usage for proper cleanup.
pub unsafe extern "C" fn gosub_render_tree_init(html: *const c_char) -> *mut RenderTree {
Copy link
Collaborator

@emwalker emwalker Nov 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we guaranteed that the Swift / Objective C side will never free this *const c_char pointer while we're in Rust code?

@Kiyoshika
Copy link
Member Author

@emwalker for some reason I can't reply to your comment

Are we guaranteed that the Swift Objective C side will never free this *const c_char pointer while we're in Rust code?

When this function is called from swift and the pointer is passed, only the Rust code (technically the C wrapper) will be doing work until after the scope closes and Swift continues, so we wouldn't be able to accidentially free this while it's being used by Rust/C

@Kiyoshika
Copy link
Member Author

Relevant swift portions if you're curious:

Calling render tree constructor (swift wrapper) - https://github.com/gosub-browser/gosub-client-macos/blob/110d52daaebe09154b2e422cbf2f63d33be51d9d/gosub-client-macos/ContentView.swift#L18C14-L18C14

The internals that call the C code - https://github.com/gosub-browser/gosub-client-macos/blob/110d52daaebe09154b2e422cbf2f63d33be51d9d/gosub-client-macos/RenderTree.swift#L8C14-L8C14

@Sharktheone
Copy link
Member

I'm fully out of my depth. I have only dabbled in C over the years, so I have no way to approve this PR beyond the impression that you and @Sharktheone seem to know C a lot more than I do. Godspeed.

I also don't have too much knowledge about C. My main understanding comes from C++ and only minimal C. But I've read some libraries that are coded in C and have a bit of understanding from C because of that, but also not too much.

@Kiyoshika
Copy link
Member Author

I've written quite a lot of C (as that makes up almost my entire github history) but none of it was professional and almost all of it was entirely solo that I've learned over time. But I've never really used it in the context of interacting with other languages (except one case with Python and ctypes for a numpy clone I wrote forever ago)

@Kiyoshika Kiyoshika merged commit c81302c into main Nov 26, 2023
4 checks passed
@Kiyoshika Kiyoshika deleted the feat/ffi-bindings branch November 26, 2023 03:41
@emwalker
Copy link
Collaborator

emwalker commented Nov 26, 2023

This talk by a Mozilla developer that wrote a Rust crate that is used in Firefox goes into some challenges he ran into with dropping pointers in a C++/Rust FFI integration that vaguely sound a little like the problem you were running into:

https://media.ccc.de/v/rustfest18-5-a_rust_crate_that_also_quacks_like_a_modern_c_library#t=1480

Might be a red herring.

@Kiyoshika
Copy link
Member Author

My next step is to go down this rabbit hole and try to figure out how to clean up the leak. I think using Rc<> will present a lot of challenges with cyclic reference counts that can lead to leaks, but not sure how that's happening in the specific case of the render tree

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants