Fix IDSelector memory leak with SWIG, add unit tests (#3704) #3810

Open
wants to merge 1 commit into
base: main

mnorris11

Summary:

Background

Issue: #2996

Prior attempted fix: https://github.com/facebookresearch/faiss/pull/3007/files#diff-d704f33c46a4ef2936c8cf2a66b05f1993e25e79ee5c19d4b63c4e0cf46b0a42

Partially copied from #3139 with an additional unit test.

The memory leak is confirmed and reproduces every time. A workaround exists; see the comments on #2996.

Relevant SWIG docs: https://www.swig.org/Doc4.1/SWIGDocumentation.html#Typemaps_nn2

Current status

buck test faiss/tests:test_search_params -- test_ownership_2

Test prints:

without SWIG fix:
[0.49, 0.82, 1.15, 1.39, 1.67, 1.88, 2.16, 2.36, 2.65, 2.85, 3.13, 3.34, 3.62, 3.82, 4.11, 4.31, 4.6, 4.8, 5.08, 5.28]

with SWIG fix:
[0.52, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.71, 0.71, 0.71, 0.71, 0.71, 0.71, 0.71]
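The two series above can be compared mechanically. The following is a minimal sketch, not the check used in `test_ownership_2`: the helper name `grows_steadily`, the two-sample warm-up, and the `0.05` tolerance are all assumptions chosen for illustration, and the samples are taken as-is in whatever unit the test prints.

```python
def grows_steadily(samples, warmup=2, tolerance=0.05):
    """Return True if usage keeps climbing after the warm-up samples.

    Hypothetical helper: the first `warmup` samples are ignored because
    the first iterations allocate memory that is legitimately retained.
    Expects more than `warmup` samples.
    """
    steady = samples[warmup:]
    return steady[-1] - steady[0] > tolerance


# Values copied from the test output quoted in this PR description.
without_fix = [0.49, 0.82, 1.15, 1.39, 1.67, 1.88, 2.16, 2.36, 2.65, 2.85,
               3.13, 3.34, 3.62, 3.82, 4.11, 4.31, 4.6, 4.8, 5.08, 5.28]
with_fix = [0.52, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7,
            0.7, 0.7, 0.7, 0.71, 0.71, 0.71, 0.71, 0.71, 0.71, 0.71]

print(grows_steadily(without_fix))  # True: usage never levels off
print(grows_steadily(with_fix))     # False: usage plateaus after warm-up
```

Comparing only the first and last steady-state samples is deliberately crude; a real test would probably want a tolerance tuned to the allocator's noise.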

Note: This test is not ideal. Ideally we would enable ASAN to catch these memory leaks, but ASAN does not seem to detect leaks that originate on the Python side. Example:

def test_ownership_3(self):
    d = 32
    quantizer = faiss.IndexFlatL2(d)
    quantizer.this.own(False)

The above test produces no ASAN error, even though the quantizer object leaks.
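To make the leak in that example concrete, here is a toy model of an ownership flag in pure Python. It is only an analogy under stated assumptions: `NativeObject`, `Proxy`, and `live_after_drop` are hypothetical names, no SWIG or Faiss code is involved, and the sketch relies on CPython running `__del__` as soon as the last reference goes away.

```python
import gc


class NativeObject:
    """Stand-in for a C++ object; `live` counts undestroyed instances."""
    live = 0

    def __init__(self):
        NativeObject.live += 1

    def destroy(self):
        NativeObject.live -= 1


class Proxy:
    """Toy wrapper that destroys its native object only if it owns it."""

    def __init__(self):
        self._native = NativeObject()
        self._owns = True

    def own(self, flag):
        # Loosely mirrors the idea behind `this.own(False)` in the example.
        self._owns = flag

    def __del__(self):
        if self._owns:
            self._native.destroy()


def live_after_drop(disown):
    """Create a proxy, optionally disown it, drop it, report survivors."""
    NativeObject.live = 0
    proxy = Proxy()
    if disown:
        proxy.own(False)
    del proxy
    gc.collect()
    return NativeObject.live


print(live_after_drop(disown=False))  # 0: the owning proxy cleaned up
print(live_after_drop(disown=True))   # 1: nobody destroyed the native object
```

In the toy model, a disowned object outlives its wrapper because nothing else is responsible for destroying it, which is the situation the `quantizer` example sets up on purpose.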

Differential Revision: D61992599

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D61992599

mnorris11 pushed a commit to mnorris11/faiss that referenced this pull request Aug 29, 2024
…h#3810)

Summary:
Pull Request resolved: facebookresearch#3810

Pull Request resolved: facebookresearch#3704


@mnorris11
Author

Looking into errors.

mnorris11 pushed a commit to mnorris11/faiss that referenced this pull request Sep 12, 2024
…h#3810)

Summary:
Pull Request resolved: facebookresearch#3810

Pull Request resolved: facebookresearch#3704

Why change HNSW test?
--
This fix causes the HNSW test to fail with a heap-use-after-free error. This is because `index.storage.get_distance_computer()` somehow gets freed during `clone_index`, but only when the clone is assigned back to the same attribute, as in `index.storage = clone(index.storage)`. I checked in https://fburl.com/code/qw6fznjt, and it is non-null before returning on the C++ side.

After adding a temporary variable to hold the clone, I also had to set `index.own_fields = False`; otherwise we get a heap-use-after-free again, because the object has already been deleted.

Differential Revision: D61992599