DNS service failure should not cause a router restart #1513

Closed
mikegorman-nf opened this issue Nov 15, 2023 · 4 comments · Fixed by #1727 or #1767

Comments

@mikegorman-nf
Member

Currently, if the DNS injection test fails, the router process restarts. The system should log the failure but continue if possible, so that connectivity is preserved for accessing the system and attempting to remedy the issue. A process restart is very unlikely to change the failed state.

plorenz added a commit that referenced this issue Nov 27, 2023
Don't terminate router on DNS failure. Fixes #1513
@dariuszSki
Contributor

dariuszSki commented Feb 9, 2024

@plorenz I found a bug related to this fix. Here are the details. The resolver test fails and the router no longer exits with an error, but when I try to reach the service and a TCP accept is initiated, the router crashes with a segmentation fault.

[   2.166]    INFO ziti/tunnel/dns.NewDnsServer: dns server running at 127.0.0.1:53
[   2.166]    INFO ziti/tunnel/dns.(*resolver).AddHostname: adding ziti-tunnel.resolver.test = 19.65.28.94 to resolver                                                                                                     
[   2.206]   ERROR ziti/router/xgress_edge_tunnel.(*tunneler).Start: {error=[system resolver test failed: failed to resolve ziti-tunnel.resolver.test: lookup ziti-tunnel.resolver.test on 192.168.4.1:53: no such host
ziti-tunnel runs an internal DNS server which must be first in the host's resolver configuration. On systems that use NetManager/dhclient, this can  be achieved by adding the following to /etc/dhcp/dhclient.conf:                                            prepend domain-name-servers 127.0.0.1:53; ]} failed to start DNS resolver
[   2.749]    INFO ziti/tunnel/intercept.SetDnsInterceptIpRange: dns intercept IP range: 100.64.0.1 - 100.127.255.254                                                                                   
...
[  29.935]    INFO ziti/tunnel/intercept/tproxy.(*tProxy).acceptTCP: received connection: 10.20.96.100:98 -->10.20.96.0:45404
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x30e7019]
goroutine 227 [running]:
github.com/openziti/ziti/tunnel/intercept/tproxy.(*tProxy).acceptTCP(0xc00106ab40)  
   github.com/openziti/ziti/tunnel/intercept/tproxy/tproxy_linux.go:300 +0x2d9
created by github.com/openziti/ziti/tunnel/intercept/tproxy.(*interceptor).newTproxy in goroutine 150       
    github.com/openziti/ziti/tunnel/intercept/tproxy/tproxy_linux.go:238 +0x6ce

This is line 300 of tproxy_linux.go:
dstHostname, _ := self.resolver.Lookup(client.LocalAddr().(*net.TCPAddr).IP)

I think the resolver is never initialized, so this lookup dereferences an invalid memory address. I may be wrong here, but we need to initialize the resolver even when the test fails.

@plorenz
Member

plorenz commented Feb 9, 2024

Reopening based on comment from @dariuszSki

@plorenz plorenz reopened this Feb 9, 2024
@scareything scareything linked a pull request Feb 14, 2024 that will close this issue
@dariuszSki
Contributor

@plorenz @scareything why couldn't we initialize the regular resolver instead of a dummy one? I am trying to understand the reason.

@scareything
Member

@dariuszSki I think that's doable and makes a lot of sense. We should be able to always start the internal DNS server even if the self-test fails. I'll take another crack at fixing this issue without giving up the DNS server.
