Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dns: fix custom servers support and extend tests #320

Merged
merged 10 commits into from
Aug 8, 2023
Merged

dns: fix custom servers support and extend tests #320

merged 10 commits into from
Aug 8, 2023

Conversation

Choraden
Copy link
Contributor

No description provided.

@saucelabs saucelabs deleted a comment from what-the-diff bot Jul 28, 2023
@Choraden
Copy link
Contributor Author

Choraden commented Jul 28, 2023

It looks like the dns fallback mechanism is broken.

func (r *resolver) dialDNS(ctx context.Context, network, address string) (net.Conn, error) {
	for _, u := range r.servers {
		r.log.Debugf("dial DNS server %s instead of %s", u, address)
		conn, err := r.dialer.DialContext(ctx, network, u.String())
		if err != nil {
			r.log.Errorf("failed to dial DNS server %s: %v", u, err)
			continue
		}
		return conn, nil
	}

	return nil, fmt.Errorf("failed to dial DNS server")
}

Custom dialDNS function always returns the connection to the first dns server, even if the server is wrong on purpose. The dial to UDP succeeds every time.

The error is encountered later when dns query is performed.

2023/07/28 13:16:34 INFO: using DNS servers: [127.0.0.1:34567 1.1.1.1:53]
--- FAIL: TestResolverLookupHostFallback (0.00s)
    dns_test.go:73: LookupHost failed: lookup google.com on 10.255.3.12:53: read udp 127.0.0.1:61671->127.0.0.1:34567: read: connection refused
    dns_test.go:76: LookupHost returned no address

@Choraden Choraden changed the title dns: add test on fallback mechanism for custom dns servers dns: rework resolver and add tests Jul 31, 2023
bind/flag.go Outdated
@@ -39,7 +39,8 @@ func DNSConfig(fs *pflag.FlagSet, cfg *forwarder.DNSConfig) {
"If specified multiple times, the first one is used as primary server, the rest are used as fallbacks. "+
"The port is optional, if not specified the default port is 53. ")
fs.DurationVar(&cfg.Timeout,
"dns-timeout", cfg.Timeout, "Timeout for dialing DNS servers.")
"dns-timeout", cfg.Timeout, "Timeout for dialing DNS servers. "+
"If specified multiple servers, the timeout for each lookup is equal to timeout divided by the number of servers.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be for each server.

@Choraden Choraden force-pushed the hg/dns branch 3 times, most recently from 2907a7f to c6e6f66 Compare August 1, 2023 15:47
@Choraden Choraden mentioned this pull request Aug 2, 2023
@Choraden Choraden marked this pull request as ready for review August 2, 2023 12:26
@Choraden Choraden requested a review from mmatczuk August 2, 2023 12:27
dial.go Outdated
Comment on lines 86 to 53
resolvers := make([]net.Resolver, 0, len(cfg.DNSServers))
for _, s := range cfg.DNSServers {
dnsAddress := s.String()
resolvers = append(resolvers, net.Resolver{
PreferGo: true,
Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
l.Debugf("dial DNS server %s instead of %s", dnsAddress, address)
return nd.DialContext(ctx, network, dnsAddress)
},
})
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor to function, if no DNS servers specified return nil

dial.go Outdated
Comment on lines 108 to 115
lctx, cancel := context.WithTimeout(ctx, d.cfg.ResolveTimeout)
ips, err := d.resolvers[i].LookupIP(lctx, network, hostname)
cancel()
if err != nil {
d.log.Debugf("failed to resolve %s on %s: %s", hostname, d.cfg.DNSServers[i], err)
} else {
return ips, nil
}
Copy link
Contributor

@mmatczuk mmatczuk Aug 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor to method

dial.go Outdated
return nil, err
}

ips, err := d.LookupIP(ctx, "ip", host)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

network is specified in the DNS server entry, this parameter should be removed from LookupIP.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

http RoundTripper uses tcp network, when resolver.LookupIP needs ip, ipv4 or ipv6

dial.go Outdated
return nil, fmt.Errorf("failed to dial %s on any IP address", address)
}

func (d *Dialer) DialFunc() dialFunc { //nolint:revive,golint // Unexported by design.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the linter. It looks odd and has no purpose really. The Dialer should implement DialContext.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

dial_test.go Outdated
)

func TestDialerLookupHost(t *testing.T) {
if _, ok := os.LookupEnv("CI"); ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should run in in CI. We have tests talking to google.com so a DNS query to CF won't hurt.

dial.go Show resolved Hide resolved
dial.go Outdated
resolvers = append(resolvers, net.Resolver{
PreferGo: true,
Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
l.Debugf("dial DNS server %s instead of %s", dnsAddress, address)
Copy link
Contributor

@mmatczuk mmatczuk Aug 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if I have 5 dns servers?
Would it dial each DNSServer 5 times?

Copy link
Contributor Author

@Choraden Choraden Aug 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would sequentially dial the 5 servers, one after another, unless a lookup succeeds. The net.Dialer dials two times by default.

dial_test.go Outdated
@@ -16,7 +16,7 @@ import (
"github.com/saucelabs/forwarder/log/stdlog"
)

func TestResolverLookupHost(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see a test for context cancelation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

@mmatczuk
Copy link
Contributor

mmatczuk commented Aug 2, 2023

I'm still not really convinced that using the resolver hack is better.

For instance https://github.com/rs/dnstrace provides you with more robust client supporting recursive queries (with CNAME support).
On top of that we would get DNS over https support for free.

The full list is here maybe you can find something even better https://pkg.go.dev/github.com/miekg/dns?tab=importedby.

@Choraden
Copy link
Contributor Author

Choraden commented Aug 2, 2023

I agree, the resolver hack has its disadvantages, but I would not give up this idea. We can try with the client and compare the results.

@Choraden Choraden force-pushed the hg/dns branch 3 times, most recently from 3c6abaf to 6780733 Compare August 3, 2023 14:34
@Choraden Choraden changed the title dns: rework resolver and add tests dns: fix custom servers support and extend tests Aug 3, 2023
@Choraden
Copy link
Contributor Author

Choraden commented Aug 3, 2023

We abandoned the custom dialer with resolvers idea in favour of using Golang's net package internals. The internals are hidden from user. To overcome it, we link the private function and manually configure dnsConfig. This way we can manipulate the behaviour of internal Golang's Resolver.

dns.go Outdated

"github.com/saucelabs/forwarder/log"
)

// dnsConfig is 1:1 copy of net.dnsConfig struct.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put it to a separate file then.

We are not capable of asserting the hostname is resolved at the right dns server.
The tests will be rewritten to e2e suite where we have more powerfull tools to perform them.
bind/flag.go Outdated
Comment on lines 124 to 118
DNSConfig(fs, &cfg.DNSConfig)
}

func HTTPTransportConfig(fs *pflag.FlagSet, cfg *forwarder.HTTPTransportConfig) {
DialConfig(fs, &cfg.DialConfig, "http")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function order.

@Choraden Choraden force-pushed the hg/dns branch 14 times, most recently from 7fa3e8b to a65d8a7 Compare August 7, 2023 12:51
@Choraden Choraden requested a review from mmatczuk August 7, 2023 14:32
Copy link
Contributor

@mmatczuk mmatczuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those can be squashed.

e2e(dns server setup): increase http dial timeout for proxy Hubert Grochowski 04/08/2023, 09:37
e2e/forwarder(service): add withHTTPDialTimeout method Hubert Grochowski 04/08/2023, 09:35

Overall very well done, work nicely split to commits.

I left some minor comments inline.

LGTM

dial.go Outdated
KeepAlive time.Duration
}

func DefaultDialConfig() DialConfig {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we return a ptr here?

dial.go Outdated
nd *net.Dialer
}

func NewDialer(cfg DialConfig) (*Dialer, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here pass a ptr, Dialer keeps a copy like it does now.

@@ -0,0 +1,51 @@
// Copyright 2023 Sauce Labs Inc. All rights reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this file should also have the Go license in it.

It would be also good to point to the file it's based on.

@@ -0,0 +1,41 @@
// Copyright 2023 Sauce Labs Inc. All rights reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as in dnsconf.go

bind/flag.go Outdated
fs.VarP(anyflag.NewSliceValue[netip.AddrPort](nil, &cfg.Servers, forwarder.ParseDNSAddress),
"dns-server", "n", "<ip>[:<port>]"+
"DNS server(s) to use instead of system default. "+
"If specified multiple times, the first one is used as primary server, the rest are used as fallbacks. "+
"If specified multiple times, the first one is used as primary server, the rest are used as fallbacks, "+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to add a note on --dns-round-robin

bind/flag.go Outdated
"Only used if DNS servers are specified. ")

fs.BoolVar(&cfg.RoundRobin, "dns-round-robin", cfg.RoundRobin, "Enable round-robin DNS server selection. "+
"Won't have any effect if DNS servers are not specified. ")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence is adds no value. Add something more relevant

If more than one DNS server is specified with the --dns-server flag, passing this flag ....

@@ -36,7 +38,14 @@ func (c *command) RunE(cmd *cobra.Command, args []string) error {
logger := stdlog.New(c.logConfig)
logger.Debugf("configuration\n%s", config)

t, err := forwarder.NewHTTPTransport(c.httpTransportConfig, nil)
if len(c.dnsConfig.Servers) > 0 {
logger.Named("dns").Infof("using DNS servers %v", c.dnsConfig.Servers)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nicer using strings.Join(c.dnsConfig.Servers, ", ",)

e2e/setups.go Outdated
networkName = "forwarder-e2e_default"
httpbinIPAddr = "192.168.100.10"
proxyIPAddr = "192.168.100.11"
dnsIPAddr = "192.168.100.13"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nicer if DNS came first with lower IPs.

This patch separates the dialer to its custom wrapper.
This not only introduces a good structure to the package, but also is nice foundation to #118.
Use net package internal functions to configure custom dns options.
e2e/forwarder(service): add withHTTPDialTimeout method

Default value for HTTPDialTiemout is 10s. Default value for DNSTimeout is 5s.
When test queried invalid address on two dns servers it would result in nondeterministic return code, as sometimes the dial timed out and sometimes the query failed.
Forwarder uses go:linkname directive. It may lead to unsafe behavior if the linked code changes.
This workflow performs a primitive check by calculating the checksums of linked modules.
If the checksums do not match, a user is forced to manually check the changes and update the checksums.

The action runs on '.version' file changes.
@Choraden
Copy link
Contributor Author

Choraden commented Aug 8, 2023

@mmatczuk I've addressed your comments. Merge when you are ready.

Increase health check start period and remove interval timeout as the test always succeeds.
@Choraden
Copy link
Contributor Author

Choraden commented Aug 8, 2023

@mmatczuk I also tweaked sc-2450 setup health check values as the server got frozen in the previous run.

@mmatczuk mmatczuk merged commit 667dd88 into main Aug 8, 2023
3 checks passed
@mmatczuk mmatczuk deleted the hg/dns branch August 8, 2023 10:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants