patch_parse: fix out-of-bounds reads caused by integer underflow#5312
Merged
ethomson merged 1 commit intolibgit2:masterfrom Dec 1, 2019
Merged
patch_parse: fix out-of-bounds reads caused by integer underflow#5312ethomson merged 1 commit intolibgit2:masterfrom
ethomson merged 1 commit intolibgit2:masterfrom
Conversation
The patch format for binary files is a simple Base85 encoding with a
length byte as prefix that encodes the current line's length. For each
line, we thus check whether the line's actual length matches its
expected length in order to not faultily apply a truncated patch. This
also acts as a check to verify that we're not reading outside of the
line's string:
if (encoded_len > ctx->parse_ctx.line_len - 1) {
error = git_parse_err(...);
goto done;
}
There is the possibility for an integer underflow, though. Given a line
with a single prefix byte, only, `line_len` will be zero when reaching
this check. As a result, subtracting one from that will result in an
integer underflow, causing us to assume that there's a wealth of bytes
available later on. Naturally, this may result in an out-of-bounds read.
Fix the issue by checking both `encoded_len` and `line_len` for a
non-zero value. The binary format doesn't make use of zero-length lines
anyway, so we need to know that there are both encoded bytes and
remaining characters available at all.
This patch also adds a test that works based on the last error message.
Checking error messages is usually too tightly coupled, but in fact
parsing the patch failed even before the change. Thus the only
possibility is to use e.g. Valgrind, but that'd result in us not
catching issues when run without Valgrind. As a result, using the error
message is considered a viable tradeoff as we know that we didn't start
decoding Base85 in the first place.
tiennou
approved these changes
Nov 28, 2019
Contributor
tiennou
left a comment
There was a problem hiding this comment.
LGTM. I only have a minor suggestion, so feel free to merge as is if that's too much. Thanks for the fix !
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The patch format for binary files is a simple Base85 encoding with a
length byte as prefix that encodes the current line's length. For each
line, we thus check whether the line's actual length matches its
expected length in order to not faultily apply a truncated patch. This
also acts as a check to verify that we're not reading outside of the
line's string:
There is the possibility for an integer underflow, though. Given a line
with a single prefix byte, only,
line_lenwill be zero when reachingthis check. As a result, subtracting one from that will result in an
integer underflow, causing us to assume that there's a wealth of bytes
available later on. Naturally, this may result in an out-of-bounds read.
Fix the issue by checking both
encoded_lenandline_lenfor anon-zero value. The binary format doesn't make use of zero-length lines
anyway, so we need to know that there are both encoded bytes and
remaining characters available at all.
This patch also adds a test that works based on the last error message.
Checking error messages is usually too tightly coupled, but in fact
parsing the patch failed even before the change. Thus the only
possibility is to use e.g. Valgrind, but that'd result in us not
catching issues when run without Valgrind. As a result, using the error
message is considered a viable tradeoff as we know that we didn't start
decoding Base85 in the first place.