tls: unconsume stream on destroy#17478

Closed
addaleax wants to merge 3 commits into nodejs:master from addaleax:tls-17475-maybefix

@addaleax
Member

@addaleax addaleax commented Dec 6, 2017

When the TLS stream is destroyed for whatever reason,
we should unset all callbacks on the underlying transport
stream.
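
To illustrate the idea, here is a minimal sketch and not the actual `TLSWrap` code. `Transport` and `TlsLike` are hypothetical stand-ins: the wrapper registers a callback on the transport that captures `this`, so it has to unset that callback when it goes away, otherwise the transport could later call into freed memory.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the underlying transport stream.
struct Transport {
  std::function<void(int)> on_read;  // invoked when data arrives

  void EmitRead(int nread) {
    if (on_read) on_read(nread);
  }
};

// Hypothetical stand-in for the TLS wrapper that consumes the transport.
class TlsLike {
 public:
  explicit TlsLike(Transport* transport) : transport_(transport) {
    // "Consume" the transport: route its read events to this object.
    transport_->on_read = [this](int nread) { bytes_seen_ += nread; };
  }

  // "Unconsume" on destruction. Without this, on_read would keep a
  // dangling `this` and a later EmitRead() would be a use-after-free.
  ~TlsLike() { transport_->on_read = nullptr; }

  int bytes_seen() const { return bytes_seen_; }

 private:
  Transport* transport_;
  int bytes_seen_ = 0;
};
```

In this sketch, a read event that arrives after the wrapper is gone is simply ignored by the transport instead of touching released memory.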

Refs (maybe Fixes): #17475

@indutny Could you take a look at this?

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

tls

@addaleax addaleax added the wip label Dec 6, 2017
@nodejs-github-bot nodejs-github-bot added the c++ and tls labels Dec 6, 2017
Member

@indutny indutny left a comment

I think it should be a safe change to land, however I'm a bit worried that there is no test for this yet.

LGTM, though.

src/tls_wrap.cc Outdated
Member

Let's change it to early return.
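
The reviewed lines are not shown in this thread, so the following is only a generic sketch of what an early-return refactor looks like. `Wrap`, `has_stream`, and both function names are hypothetical and not taken from `tls_wrap.cc`.

```cpp
#include <cassert>

// Hypothetical state; not the actual TLSWrap members.
struct Wrap {
  bool has_stream = true;
  int cleanups = 0;
};

// Nested form: the real work sits inside the condition.
void DestroyNested(Wrap* w) {
  if (w->has_stream) {
    w->has_stream = false;
    w->cleanups++;
  }
}

// Early-return form: bail out first, keep the main path unindented.
void DestroyEarlyReturn(Wrap* w) {
  if (!w->has_stream) return;
  w->has_stream = false;
  w->cleanups++;
}
```

Both forms behave the same; the second one just keeps the cleanup path at the top indentation level.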

@addaleax addaleax removed the wip label Dec 7, 2017
@addaleax
Member Author

addaleax commented Dec 7, 2017

@indutny addressed your comment & added a test

CI: https://ci.nodejs.org/job/node-test-commit/14651/

@addaleax addaleax added the author ready label Dec 7, 2017
Member

This is actually an unused var. As far as I can see, you only wanted to disable the rule for clientTLSHandle, so you could just use // eslint-disable-next-line no-unused-vars?

Member Author

done!

@BridgeAR
Member

I think it would be good to get another pair of eyes on this before landing due to the changed code. Therefore I am removing the ready label in the meanwhile.

@BridgeAR BridgeAR removed the author ready label Dec 12, 2017
When the TLS stream is destroyed for whatever reason,
we should unset all callbacks on the underlying transport
stream.

Fixes: nodejs#17475
@addaleax
Member Author

@nodejs/collaborators Could somebody review (at least) the test here, since that has kind of implicitly been requested by @BridgeAR?

Contributor

@hybrist hybrist left a comment

The test file LGTM if my assumptions are correct.

'use strict';

// Regression test for https://github.com/nodejs/node/issues/17475
// Unfortunately, this tests only "works" reliably when checked with valgrind or

Contributor

Just for clarity: I assume "works" in this context means "doesn't raise memory leak errors"..? And properly running the test is just using the --valgrind option?

Member Author

@jkrems Yeah – it complains about accessing released memory when run with valgrind. “Unfortunately”, most of the time that doesn’t lead to crashes…

Running the test file with --valgrind should make that reflected in the exit code, yes :)

Member

@mcollina mcollina left a comment

LGTM

#ifdef SSL_CTRL_SET_TLSEXT_SERVERNAME_CB
sni_context_.Reset();
#endif // SSL_CTRL_SET_TLSEXT_SERVERNAME_CB

Member

I would add a comment before this block linking to the issue.

Member Author

I’ve added a reference to the test and a quick explanation; the test contains a link to the issue.

@addaleax addaleax added the author ready label Dec 13, 2017
@addaleax
Member Author

Landed in f96a86c

@addaleax addaleax closed this Dec 13, 2017
@addaleax addaleax deleted the tls-17475-maybefix branch December 13, 2017 05:47
@addaleax addaleax removed the author ready label Dec 13, 2017
@gibfahn
Member

gibfahn commented Dec 20, 2017

Normally we don't backport to LTS until things have been in a current release (9.x) for at least two weeks. This hasn't gone into a 9.x release yet, so we'd be talking 6 weeks before this could land in v8.x.

If this is a serious bug we should consider including it in the v8.9.4 release candidate build going out today, so that it goes out with v8.9.4 in two weeks.

What would be helpful is an idea of:

  1. How much pain this bug causes (if it's causing a hard crash repeatedly I'd assume a lot!)
  2. How many people are going to hit this
  3. How minor the fix is (how much risk is there in rushing it)

@mcollina
Member

I don't think this needs to be rushed. This problem has likely been in for a very long time, and we just heard about it recently.

@qubyte
Contributor

qubyte commented Dec 20, 2017

Hitting this bug should be pretty rare. I'm seeing it, but my use case is very similar to the code in #17475 (remarkably, I encountered it about a day later). It caused a whole lot of pain, but I'm happily using a recent build which includes the fixes, and will move over to the next release of 9 when it appears. The fix itself appears fairly low risk.

On balance, I would probably not do anything different to the usual process. The bug has been around for a long time, and I know of only two times it has been encountered.

@odinho

odinho commented Dec 20, 2017

I don't disagree with anything above. But I want to point out that we've also seen this bug for a long time, and recently (as in the last 2-3 weeks) it's been happening a lot more often. We had some trouble nailing it down back when it was a sporadic crash (and also a sporadic failure in our unit tests).

Due to the nature of the bug, that could be due to a multitude of reasons. And this increase in frequency might be due to some of our code or change in our usage.

I wanted to propose it at least.

@MylesBorins
Contributor

+1 on waiting until the next release

@MylesBorins MylesBorins mentioned this pull request Jan 10, 2018
@MylesBorins
Contributor

I've landed this on both v6.x and v8.x staging. Please lmk if there are objections

@addaleax
Member Author

@MylesBorins Given the kind of issue that this fixes (rare, hard-to-debug segfault), I would even like to see it on v4.x. Would that be okay with you?

@MylesBorins
Contributor

@addaleax there were no plans for another 4.x but I'm open to considering if you think this is a big enough deal. Would you be willing to open an issue on release to discuss?

@MylesBorins MylesBorins mentioned this pull request Jan 24, 2018