-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dns: fix custom servers support and extend tests #320
Conversation
It looks like the dns fallback mechanism is broken. func (r *resolver) dialDNS(ctx context.Context, network, address string) (net.Conn, error) {
for _, u := range r.servers {
r.log.Debugf("dial DNS server %s instead of %s", u, address)
conn, err := r.dialer.DialContext(ctx, network, u.String())
if err != nil {
r.log.Errorf("failed to dial DNS server %s: %v", u, err)
continue
}
return conn, nil
}
return nil, fmt.Errorf("failed to dial DNS server")
} Custom dialDNS function always returns the connection to the first dns server, even if the server is wrong on purpose. The dial to UDP succeeds every time. The error is encountered later when dns query is performed.
|
bind/flag.go
Outdated
@@ -39,7 +39,8 @@ func DNSConfig(fs *pflag.FlagSet, cfg *forwarder.DNSConfig) { | |||
"If specified multiple times, the first one is used as primary server, the rest are used as fallbacks. "+ | |||
"The port is optional, if not specified the default port is 53. ") | |||
fs.DurationVar(&cfg.Timeout, | |||
"dns-timeout", cfg.Timeout, "Timeout for dialing DNS servers.") | |||
"dns-timeout", cfg.Timeout, "Timeout for dialing DNS servers. "+ | |||
"If specified multiple servers, the timeout for each lookup is equal to timeout divided by the number of servers.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be for each server.
2907a7f
to
c6e6f66
Compare
dial.go
Outdated
resolvers := make([]net.Resolver, 0, len(cfg.DNSServers)) | ||
for _, s := range cfg.DNSServers { | ||
dnsAddress := s.String() | ||
resolvers = append(resolvers, net.Resolver{ | ||
PreferGo: true, | ||
Dial: func(ctx context.Context, network, address string) (net.Conn, error) { | ||
l.Debugf("dial DNS server %s instead of %s", dnsAddress, address) | ||
return nd.DialContext(ctx, network, dnsAddress) | ||
}, | ||
}) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactor to function, if no DNS servers specified return nil
dial.go
Outdated
lctx, cancel := context.WithTimeout(ctx, d.cfg.ResolveTimeout) | ||
ips, err := d.resolvers[i].LookupIP(lctx, network, hostname) | ||
cancel() | ||
if err != nil { | ||
d.log.Debugf("failed to resolve %s on %s: %s", hostname, d.cfg.DNSServers[i], err) | ||
} else { | ||
return ips, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactor to method
dial.go
Outdated
return nil, err | ||
} | ||
|
||
ips, err := d.LookupIP(ctx, "ip", host) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
network is specified in the DNS server entry, this parameter should be removed from LookupIP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
http RoundTripper uses tcp
network, when resolver.LookupIP needs ip
, ipv4
or ipv6
dial.go
Outdated
return nil, fmt.Errorf("failed to dial %s on any IP address", address) | ||
} | ||
|
||
func (d *Dialer) DialFunc() dialFunc { //nolint:revive,golint // Unexported by design. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the linter. It looks odd and has no purpose really. The Dialer should implement DialContext.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
dial_test.go
Outdated
) | ||
|
||
func TestDialerLookupHost(t *testing.T) { | ||
if _, ok := os.LookupEnv("CI"); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should run in in CI. We have tests talking to google.com so a DNS query to CF won't hurt.
dial.go
Outdated
resolvers = append(resolvers, net.Resolver{ | ||
PreferGo: true, | ||
Dial: func(ctx context.Context, network, address string) (net.Conn, error) { | ||
l.Debugf("dial DNS server %s instead of %s", dnsAddress, address) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if I have 5 dns servers?
Would it dial each DNSServer 5 times?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would sequentially dial the 5 servers, one after another, unless a lookup succeeds. The net.Dialer dials two times by default.
dial_test.go
Outdated
@@ -16,7 +16,7 @@ import ( | |||
"github.com/saucelabs/forwarder/log/stdlog" | |||
) | |||
|
|||
func TestResolverLookupHost(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see a test for context cancelation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
I'm still not really convinced that using the resolver hack is better. For instance https://github.com/rs/dnstrace provides you with more robust client supporting recursive queries (with CNAME support). The full list is here maybe you can find something even better https://pkg.go.dev/github.com/miekg/dns?tab=importedby. |
I agree, the resolver hack has its disadvantages, but I would not give up this idea. We can try with the client and compare the results. |
3c6abaf
to
6780733
Compare
We abandoned the custom dialer with resolvers idea in favour of using Golang's net package internals. The internals are hidden from user. To overcome it, we link the private function and manually configure dnsConfig. This way we can manipulate the behaviour of internal Golang's Resolver. |
dns.go
Outdated
|
||
"github.com/saucelabs/forwarder/log" | ||
) | ||
|
||
// dnsConfig is 1:1 copy of net.dnsConfig struct. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put it to a separate file then.
We are not capable of asserting the hostname is resolved at the right dns server. The tests will be rewritten to e2e suite where we have more powerfull tools to perform them.
bind/flag.go
Outdated
DNSConfig(fs, &cfg.DNSConfig) | ||
} | ||
|
||
func HTTPTransportConfig(fs *pflag.FlagSet, cfg *forwarder.HTTPTransportConfig) { | ||
DialConfig(fs, &cfg.DialConfig, "http") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function order.
7fa3e8b
to
a65d8a7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those can be squashed.
e2e(dns server setup): increase http dial timeout for proxy Hubert Grochowski 04/08/2023, 09:37
e2e/forwarder(service): add withHTTPDialTimeout method Hubert Grochowski 04/08/2023, 09:35
Overall very well done, work nicely split to commits.
I left some minor comments inline.
LGTM
dial.go
Outdated
KeepAlive time.Duration | ||
} | ||
|
||
func DefaultDialConfig() DialConfig { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we return a ptr here?
dial.go
Outdated
nd *net.Dialer | ||
} | ||
|
||
func NewDialer(cfg DialConfig) (*Dialer, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here pass a ptr, Dialer keeps a copy like it does now.
utils/osdns/dnsconf.go
Outdated
@@ -0,0 +1,51 @@ | |||
// Copyright 2023 Sauce Labs Inc. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this file should also have the Go license in it.
It would be also good to point to the file it's based on.
utils/osdns/resolvconf.go
Outdated
@@ -0,0 +1,41 @@ | |||
// Copyright 2023 Sauce Labs Inc. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as in dnsconf.go
bind/flag.go
Outdated
fs.VarP(anyflag.NewSliceValue[netip.AddrPort](nil, &cfg.Servers, forwarder.ParseDNSAddress), | ||
"dns-server", "n", "<ip>[:<port>]"+ | ||
"DNS server(s) to use instead of system default. "+ | ||
"If specified multiple times, the first one is used as primary server, the rest are used as fallbacks. "+ | ||
"If specified multiple times, the first one is used as primary server, the rest are used as fallbacks, "+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to add a note on --dns-round-robin
bind/flag.go
Outdated
"Only used if DNS servers are specified. ") | ||
|
||
fs.BoolVar(&cfg.RoundRobin, "dns-round-robin", cfg.RoundRobin, "Enable round-robin DNS server selection. "+ | ||
"Won't have any effect if DNS servers are not specified. ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence is adds no value. Add something more relevant
If more than one DNS server is specified with the --dns-server flag, passing this flag ....
cmd/forwarder/pac/server/server.go
Outdated
@@ -36,7 +38,14 @@ func (c *command) RunE(cmd *cobra.Command, args []string) error { | |||
logger := stdlog.New(c.logConfig) | |||
logger.Debugf("configuration\n%s", config) | |||
|
|||
t, err := forwarder.NewHTTPTransport(c.httpTransportConfig, nil) | |||
if len(c.dnsConfig.Servers) > 0 { | |||
logger.Named("dns").Infof("using DNS servers %v", c.dnsConfig.Servers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nicer using strings.Join(c.dnsConfig.Servers, ", ",)
e2e/setups.go
Outdated
networkName = "forwarder-e2e_default" | ||
httpbinIPAddr = "192.168.100.10" | ||
proxyIPAddr = "192.168.100.11" | ||
dnsIPAddr = "192.168.100.13" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nicer if DNS came first with lower IPs.
This patch separates the dialer to its custom wrapper. This not only introduces a good structure to the package, but also is nice foundation to #118.
Use net package internal functions to configure custom dns options.
e2e/forwarder(service): add withHTTPDialTimeout method Default value for HTTPDialTiemout is 10s. Default value for DNSTimeout is 5s. When test queried invalid address on two dns servers it would result in nondeterministic return code, as sometimes the dial timed out and sometimes the query failed.
Forwarder uses go:linkname directive. It may lead to unsafe behavior if the linked code changes. This workflow performs a primitive check by calculating the checksums of linked modules. If the checksums do not match, a user is forced to manually check the changes and update the checksums. The action runs on '.version' file changes.
@mmatczuk I've addressed your comments. Merge when you are ready. |
Increase health check start period and remove interval timeout as the test always succeeds.
@mmatczuk I also tweaked sc-2450 setup health check values as the server got frozen in the previous run. |
No description provided.