Check types boundaries #295

Open
achaikou opened this issue Sep 17, 2020 · 0 comments

Right now dlisio does not perform enough bounds checks when reading data. This can cause segfaults and other memory errors, since we might attempt to read past valid memory.
While this is very unlikely during normal file processing, it becomes somewhat more likely with the planned support for "read as many VRs as possible until an error happens", because the data may already be corrupted in the last "correctly" processed record.

dlisio is designed so that the functions defined in types.h are primitives, hence it is not their responsibility to check boundaries. Boundaries must be ensured by the caller, which knows how much data is left in the current record or before the end of the file.
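A minimal sketch of that split of responsibilities, for a fixed-size type. The names `read_ulong` and `checked_ulong` are hypothetical stand-ins, not the actual functions in types.h; the only assumption carried over is that a primitive takes a source pointer, decodes a value and returns a pointer past the bytes it consumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>

// Hypothetical primitive in the spirit of types.h: decodes a 4-byte
// big-endian unsigned integer and returns a pointer past the consumed
// bytes. It performs no bounds checking of its own.
const char* read_ulong(const char* src, std::uint32_t* out) {
    unsigned char b[4];
    std::memcpy(b, src, sizeof(b));
    *out = (std::uint32_t(b[0]) << 24) | (std::uint32_t(b[1]) << 16)
         | (std::uint32_t(b[2]) << 8)  |  std::uint32_t(b[3]);
    return src + sizeof(b);
}

// Hypothetical caller-side wrapper: the caller knows where the current
// record ends, so it verifies that 4 bytes are available before
// delegating to the unchecked primitive.
const char* checked_ulong(const char* cur, const char* end,
                          std::uint32_t* out) {
    if (static_cast<std::size_t>(end - cur) < 4)
        throw std::runtime_error("ulong: fewer than 4 bytes left in record");
    return read_ulong(cur, out);
}
```

Because the size is fixed by the specification, the check is a single comparison and the primitive itself stays untouched.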

We have several situations:

  • fixed-size data. Since the specification defines how many bytes each type occupies, it is easy to ensure boundaries from the outside.
  • variable-size "unbounded" data (meaning ascii). This case is more complicated. One solution is to first read the data length separately and then check whether the value fits in the remaining bytes.
  • variable-size "bounded" data. We could handle it the same way as "unbounded" data, but the most prominent example (ident) is also a part of other types such as obname and objref, so another solution might be preferable:
    We could do a post-check. To prevent a bad read, create a buffer of the maximum possible size, copy the remaining data into it and use it as the source, which guarantees that we never read outside of memory. Afterwards, check that the value actually fits in the remaining bytes. This approach is impractical for ascii data, however, as its maximum possible length is around 1GB.
    Example pseudo-code for ident, which has a maximum size of 256 bytes:
if (remaining < 256)
  make-buffer buf(256)
  copyto(buf, src, remaining)
  dlis_ident(buf as src...)
  assert ident_length <= remaining
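The post-check idea could look roughly like the following C++ sketch. `read_ident` and `checked_ident` are hypothetical names used for illustration and do not reflect the real `dlis_ident` signature. The sketch assumes an ident is one length byte followed by up to 255 characters, which matches the maximum size of 256 mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <utility>

constexpr std::size_t IDENT_MAX = 256; // 1 length byte + up to 255 chars

// Hypothetical unchecked primitive: trusts the length byte and reads
// that many characters, with no knowledge of where the record ends.
const char* read_ident(const char* src, std::string* out) {
    const std::size_t len = static_cast<unsigned char>(src[0]);
    out->assign(src + 1, len);
    return src + 1 + len;
}

// Hypothetical caller-side wrapper implementing the post-check.
const char* checked_ident(const char* cur, const char* end,
                          std::string* out) {
    const std::size_t remaining = static_cast<std::size_t>(end - cur);
    if (remaining >= IDENT_MAX)
        return read_ident(cur, out); // even the longest ident fits

    // Copy what is left into a zero-filled buffer of maximum size, so
    // the primitive can never read outside memory we own.
    char buf[IDENT_MAX] = {};
    std::memcpy(buf, cur, remaining);

    std::string tmp;
    const char* stop = read_ident(buf, &tmp);
    const std::size_t consumed = static_cast<std::size_t>(stop - buf);

    // Post-check: the ident must fit in the bytes that really existed.
    if (consumed > remaining)
        throw std::runtime_error("ident: length exceeds remaining bytes");

    *out = std::move(tmp);
    return cur + consumed;
}
```

The extra copy is only paid near the end of a record, when fewer than 256 bytes remain. The same wrapper shape would presumably carry over to obname and objref, since they embed ident.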

Issues that are part of this one or partly relate: #210, #205, #187
