Feature: database.stash and exposing a filesystem-like api in an IDA database #8

Open
arizvisa opened this issue Sep 23, 2018 · 4 comments

Comments

@arizvisa
Owner

This was experimented with during "toolbag" development a long time ago, but it never made it into ida-minsc because the two plugins have vastly different intentions and mantras. In essence, a filesystem-like api exposed via something like database.stash would be very useful for letting a user bundle arbitrarily sized data with their IDA database. This would facilitate higher-level plugin development that needs to cache extra data in the database instead of storing it externally on the platform's regular filesystem.

The "blob" type for a netnode in an IDA database has some size limitations that need to be worked around in order to provide this capability. To support arbitrarily sized data, we could use a linked list or a tree to locate a file's different chunks. A better way would be to actually implement an embedded filesystem using "blob" types. We can probably implement something similar to a FAT-based filesystem (or some other filesystem) using blobs as its primitive storage mechanism.
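A minimal sketch of the allocation-table idea, assuming a plain dict as a stand-in for netnode blob storage. The names `BlobFS`, `CHUNK_SIZE`, and `END` are hypothetical, and the chunk size is an arbitrary illustration value rather than the real blob limit. Nothing here touches the IDA API.

```python
CHUNK_SIZE = 1024  # hypothetical per-blob limit, chosen only for illustration
END = -1           # marks the last chunk of a file in the allocation table


class BlobFS(object):
    """Toy FAT-style store; dicts stand in for the netnode blob storage."""

    def __init__(self):
        self.blobs = {}      # chunk index -> bytes (at most CHUNK_SIZE each)
        self.table = {}      # chunk index -> next chunk index, or END
        self.directory = {}  # file name -> (first chunk index, total size)
        self.next_free = 0   # freed chunks are never reused in this sketch

    def _allocate(self):
        index = self.next_free
        self.next_free += 1
        return index

    def write(self, name, data):
        # Replace any existing file of the same name.
        if name in self.directory:
            self.delete(name)
        previous, first = None, END
        for offset in range(0, len(data), CHUNK_SIZE):
            index = self._allocate()
            self.blobs[index] = data[offset:offset + CHUNK_SIZE]
            self.table[index] = END
            if previous is None:
                first = index
            else:
                self.table[previous] = index  # link the chain
            previous = index
        self.directory[name] = (first, len(data))

    def read(self, name):
        index, _ = self.directory[name]
        chunks = []
        while index != END:
            chunks.append(self.blobs[index])
            index = self.table[index]
        return b"".join(chunks)

    def delete(self, name):
        index, _ = self.directory.pop(name)
        while index != END:
            self.blobs.pop(index)
            index = self.table.pop(index)
```

Usage would look like `fs = BlobFS(); fs.write("cache.bin", data); fs.read("cache.bin")`. A real version would need a free list so deleted chunks get reused, which is where fragmentation starts to matter.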

The lower-level components could then be implemented as internal.netnode.nodefs, at which point some higher-level interface could be exposed via the database module. If we have some kind of filesystem like this, then we could begin to consider tags with no size limitations. We could also modify the semantics of tagging a bit so that any tag name prefixed with a double underscore (as in Python name mangling) would result in a hidden tag that is physically stored in the filesystem. This would allow serialization of types other than the basic Python ones that we presently encode within comments.
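A rough sketch of how the double-underscore routing could behave, assuming dicts as stand-ins for both the comment-encoded tags and the embedded filesystem. The `TagStore` class, the `tags/<address>/<name>` path layout, and the use of `pickle` for serialization are all assumptions made for illustration, not anything ida-minsc does today.

```python
import pickle


class TagStore(object):
    """Toy tag store that routes double-underscored names to a filesystem."""

    def __init__(self):
        self.comments = {}  # stand-in for tags encoded within comments
        self.files = {}     # stand-in for the embedded filesystem

    @staticmethod
    def _path(address, name):
        # Hypothetical path layout inside the filesystem.
        return "tags/{:x}/{}".format(address, name)

    def set(self, address, name, value):
        if name.startswith("__"):
            # Hidden tag: serialize the value and store it as a file.
            self.files[self._path(address, name)] = pickle.dumps(value)
        else:
            self.comments.setdefault(address, {})[name] = value

    def get(self, address, name):
        if name.startswith("__"):
            return pickle.loads(self.files[self._path(address, name)])
        return self.comments[address][name]

    def visible(self, address):
        # Only comment-backed tags are listed; hidden ones stay out of view.
        return sorted(self.comments.get(address, {}))
```

With this, `store.set(0x401000, "__callers", {0x401234, 0x405678})` would round-trip a set without it ever appearing in `store.visible(0x401000)`.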

@arizvisa
Owner Author

Some related links on ideas for data structures that can be used for this. I'm just closing out some old tabs, so I'll leave these here so they aren't lost.

@arizvisa
Owner Author

It looks like we can maybe rip https://github.com/williballenthin/ida-netnode/blob/master/netnode/netnode.py... I should probably ask him if it's okay someday.

@arizvisa
Owner Author

arizvisa commented Oct 8, 2020

PR #70 was created to perform the research needed to implement this feature.

@arizvisa
Owner Author

Note to self: make sure the filesystem is versioned in some way. Since the first iteration is probably going to be an allocation table, we might want to upgrade this to a balanced tree at some point because defragmentation sucks. Versioning lets us introduce breaking changes to the underlying structure in the future while still offering the option of either using the older implementation or converting the older data structure to the new one.
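One way the versioning could look, as a sketch only: a small header carrying a magic value and a version number, plus a table of per-version converters. The `NDFS` magic, the header layout, and the converter names are hypothetical, and the v1-to-v2 converter is a placeholder that leaves the payload untouched.

```python
import struct

MAGIC = b"NDFS"                 # hypothetical magic value
HEADER = struct.Struct("<4sH")  # magic, then an unsigned 16-bit version
CURRENT_VERSION = 2


def pack(version, payload):
    return HEADER.pack(MAGIC, version) + payload


def unpack(raw):
    magic, version = HEADER.unpack_from(raw)
    if magic != MAGIC:
        raise ValueError("not a recognized filesystem header")
    return version, raw[HEADER.size:]


def _v1_to_v2(payload):
    # Placeholder: a real converter would rebuild the allocation table
    # into whatever structure version 2 uses (for example, a balanced tree).
    return payload


# Maps a version to the function that converts it to the next version.
CONVERTERS = {1: _v1_to_v2}


def upgrade(raw, target=CURRENT_VERSION):
    version, payload = unpack(raw)
    while version < target:
        if version not in CONVERTERS:
            raise ValueError("no converter from version %d" % version)
        payload = CONVERTERS[version](payload)
        version += 1
    return pack(version, payload)
```

Checking the version on open leaves both options available: keep dispatching to the older implementation, or call something like `upgrade()` once and write the result back.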
