-
Notifications
You must be signed in to change notification settings - Fork 386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add antctl get fqdncache #6868
base: main
Are you sure you want to change the base?
Add antctl get fqdncache #6868
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the debug logs before pusing a PR, or mark this WIP/draft for now. Also, your change seems to break a ton of AgentQuerier
interface implementations if you check the lint failures
ae05e95
to
6a561ee
Compare
This PR adds functionality for a new antctl command: antctl get fqdncache This command fetches the DNS mapping entries for FQDN policies by reading the cache for DNS entries and outputting FQDN name, associated IP, and expiration time. It can also be used with a --domain (-d) flag that can be specified to filter the result for only a certain FQDN name. Signed-off-by: Dhruv-J <[email protected]>
0f68993
to
17487a4
Compare
Signed-off-by: Dhruv-J <[email protected]>
Signed-off-by: Dhruv-J <[email protected]>
Signed-off-by: Dhruv-J <[email protected]>
/test-all |
Signed-off-by: Dhruv-J <[email protected]>
Signed-off-by: Dhruv-J <[email protected]>
Signed-off-by: Dhruv-J <[email protected]>
/test-conformance |
"time" | ||
|
||
"github.com/stretchr/testify/require" | ||
"gotest.tools/assert" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use github.com/stretchr/testify/assert
to avoid extra dependency added in go.mod
hack/update-codegen.sh
Outdated
@@ -34,6 +34,7 @@ IMAGE_NAME="antrea/codegen:kubernetes-1.31.1-build.1" | |||
# to the "safe" list in the Git config when starting the container (required | |||
# because of user mismatch). | |||
if git_status=$(git status --porcelain --untracked=no 2>/dev/null) && [[ -n "${git_status}" ]]; then | |||
echo ${git_status} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrelated change
multicluster/hack/update-codegen.sh
Outdated
@@ -34,6 +34,7 @@ IMAGE_NAME="antrea/codegen:kubernetes-1.31.1-build.1" | |||
# to the "safe" list in the Git config when starting the container (required | |||
# because of user mismatch). | |||
if git_status=$(git status --porcelain --untracked=no 2>/dev/null) && [[ -n "${git_status}" ]]; then | |||
echo ${git_status} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
pkg/agent/apis/types.go
Outdated
return false | ||
} | ||
|
||
// maybe some helper funcs needed here to help parse through ^^^^ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain what this means
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was a dev note, removing now, as functionality has been added
for _, dnsCacheEntryExp := range expectedEntryList { | ||
found := false | ||
for j, dnsCacheEntryRet := range returnedList { | ||
if !found && dnsCacheEntryExp.FqdnName == dnsCacheEntryRet.FqdnName { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you try if assert.ElementsMatch
works out of the box for comparing those lists?
pkg/agent/apis/types.go
Outdated
} | ||
} | ||
|
||
func (r FqdnCacheResponse) SortRows() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is the default option for sorting false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the rows of the table won't need to be sorted, as there's no filter or flag for it, so the default is false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember how this function is used exactly, but it would be nice if the output rows were filtered based on the FQDN.
assert.Equal(t, tt.expectedStatus, recorder.Code) | ||
if tt.expectedStatus == http.StatusOK { | ||
var received []map[string]interface{} | ||
err = json.Unmarshal(recorder.Body.Bytes(), &received) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any tests to add to check for marshal and unmarshal operation correctly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
triggering this error manually would be difficult, as the json transformation is not something the test is directly in control of, it is also not tested on other PRs which add antctl functionality
t.Run(tt.name, func(t *testing.T) { | ||
reqByte, _ := json.Marshal(tt.fqdnList) | ||
reqReader := bytes.NewReader(reqByte) | ||
result, err := Transform(reqReader, false, tt.opts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any test to add specific transform function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the transform function is tested as part of response.go, if it fails, then the test will fail as well
pkg/agent/apis/types.go
Outdated
expirationTime time.Time | ||
} | ||
|
||
func (r FqdnCacheResponse) GetTableHeader() []string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you check if we can add specific test cases to cover this in code cov
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems to be left uncovered conventionally, as it is a simple method that primarily relies on the Transform() function to execute correctly, I don't think unit tests should be added for these funcs
there's also kind e2e test noEncap mode also failing |
Signed-off-by: Dhruv-J <[email protected]>
/test-conformance |
pkg/agent/apis/types.go
Outdated
@@ -72,6 +74,28 @@ func (r AntreaAgentInfoResponse) SortRows() bool { | |||
return true | |||
} | |||
|
|||
type FqdnCacheResponse struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be FQDNCacheResponse
per our coding conventions
you can refer to Quan's document: https://github.com/tnqn/code-review-comments
pkg/agent/apis/types.go
Outdated
} | ||
} | ||
|
||
func (r FqdnCacheResponse) SortRows() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember how this function is used exactly, but it would be nice if the output rows were filtered based on the FQDN.
if dnsEntryCache == nil { | ||
return | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that your GetFqdnCache
function will never return nil
as far as I can tell. It's fine to keep the code as is, but maybe add a comment to this effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that was for the older function, when it would return nil instead of empty, but that's fixed now, so I can remove this bit of code.
"*.example.com": { | ||
responseIPs: map[string]ipWithExpiration{ | ||
"maps.example.com": { | ||
ip: net.ParseIP("10.0.0.1"), | ||
expirationTime: time.Date(2025, 12, 25, 15, 0, 0, 0, time.UTC), | ||
}, | ||
"mail.example.com": { | ||
ip: net.ParseIP("10.0.0.2"), | ||
expirationTime: time.Date(2025, 12, 25, 15, 0, 0, 0, time.UTC), | ||
}, | ||
"photos.example.com": { | ||
ip: net.ParseIP("10.0.0.3"), | ||
expirationTime: time.Date(2025, 12, 25, 15, 0, 0, 0, time.UTC), | ||
}, | ||
}, | ||
}, | ||
"antrea.io": { | ||
responseIPs: map[string]ipWithExpiration{ | ||
"antrea.io": { | ||
ip: net.ParseIP("10.0.0.4"), | ||
expirationTime: time.Date(2025, 12, 25, 15, 0, 0, 0, time.UTC), | ||
}, | ||
}, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the test input data doesn't match what we actually store in the cache
There is never a wildcard domain ("*.example.com"
) as the key in the dnsEntryCache
map
For the nested map (map[string]ipWithExpiration
) the key is actually the string representation of the IP (for historical reasons). See:
antrea/pkg/agent/controller/networkpolicy/fqdn.go
Lines 79 to 82 in 357d38f
// Key for responseIPs is the string representation of the IP. | |
// It helps to quickly identify IP address updates when a | |
// new DNS response is received. | |
responseIPs map[string]ipWithExpiration |
So it's actually like this:
"antrea.io": {
responseIPs: map[string]ipWithExpiration{
""10.0.0.4"": {
ip: net.ParseIP("10.0.0.4"),
expirationTime: time.Date(2025, 12, 25, 15, 0, 0, 0, time.UTC),
},
},
},
recorder := httptest.NewRecorder() | ||
handler.ServeHTTP(recorder, req) | ||
assert.Equal(t, tt.expectedStatus, recorder.Code) | ||
if tt.expectedStatus == http.StatusOK { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just remove the if here, as the status code is always 200
Note that this unit test actually validates very little (which is expected), so in my opinion we didn't really need multiple test cases, just the second one ("FQDN cache exists - multiple addresses multiple domains"
) would have been fine as a basic (not table-driven) test.
if err != nil { | ||
return nil, err | ||
} | ||
domain, exists := opts["domain"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure why you chose this approach, but I think we usually do server-side filtering. Did you follow a specific example that prompted you to do client-side filtering?
The advantage of server-side filtering is that the dns cache can be huge, so being able to filter on a specific domain name in the Agent directly would be more efficient.
BTW, I think we should support a wildcard domain in the query, as that would be useful when debugging a policy rule which uses a wildcard in the FQDN. cc @Dyanngg
Signed-off-by: Dhruv-J <[email protected]>
Signed-off-by: Dhruv Jain <[email protected]>
Signed-off-by: Dhruv-J <[email protected]>
Signed-off-by: Dhruv-J <[email protected]>
Signed-off-by: Dhruv-J <[email protected]>
cacheEntryList := []types.DnsCacheEntry{} | ||
var pattern *regexp.Regexp | ||
var err error | ||
if fqdnFilter != (querier.FQDNCacheFilter{}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite unconventional imo, and I doubt it works the way you think it would. In newFilterFromURLQuery
, it would return a new FQDNCacheFilter with "DomainName" set to "" if no filter is provided in the url, thus not matching the else branch below. What can be done is making the fqdnFilter
a pointer, and check if is nil to see if there are any filtering requirements at all
pkg/querier/querier.go
Outdated
// FQDNCacheFilter is used to filter the result while retrieving FQDN cache | ||
type FQDNCacheFilter struct { | ||
// The Name or wildcard matching expression of the domain that is being filtered | ||
DomainName string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'll just call it Domain
since it can actually be a regex
} | ||
for fqdn, dnsMeta := range c.fqdnController.dnsEntryCache { | ||
for _, ipWithExpiration := range dnsMeta.responseIPs { | ||
if fqdnFilter == (querier.FQDNCacheFilter{}) || pattern.MatchString(fqdn) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fqdnFilter == nil will also be a much more appropriate check here to avoid constructing an empty FQDNCacheFilter
for each dns data in cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a key issue with the PR right now. You have several types that you use for the same purpose but they are not used correctly IMO.
FQDNCacheResponse
inpkg/agent/apis
DnsCacheEntry
inpkg/agent/types
Response
inpkg/antctl/transform/fqdncache
, which wraps a*DnsCacheEntry
.
FQDNCacheResponse
is actually not really used at all. It is definitely not used on the server side, as you are marshaling a []DnsCacheEntry
. It does not have any unexported
DnsCacheEntry
(and everything in pkg/agent/types
) is not really meant to be used in antctl - you will notice that antctl didn't import this package prior to your change. There is also nothing with json
tags in pkg/agent/types
. This is what the pkg/agent/apis
is meant to be for.
The data you marshal on the server side and unmarshal on the client side (antctl) should have the same type to ensure correctness.
There are 2 possible approaches:
- Fix
FQDNCacheResponse
(export fields, add json tags, maybe replace thenet.IP
with a string). In the handler, convert eachDnsCacheEntry
to aFQDNCacheResponse
manually. You can refer thebgp*
agent handlers for examples. json tags should be removed fromDnsCacheEntry
, as this type is not meant to be serialized. - Unify
FQDNCacheResponse
andDnsCacheEntry
as a single type (there again, I would use astring
instead of anet.IP
for the IP address field), and use it everywhere. We have one handler like this:pkg/agent/apiserver/handlers/serviceexternalip/handler.go
, with theServiceExternalIPInfo
struct.
In both cases I think you can / should remove pkg/antctl/transform/fqdncache/response.go
.
I don't have a strong preference between the 2 approaches. Typically I'd go with the first one and keep 2 types. The conversion logic is a good place to convert the net.IP
field in DnsCacheEntry
to a string
field in FQDNCacheResponse
.
pkg/agent/querier/querier.go
Outdated
@@ -47,6 +48,7 @@ type AgentQuerier interface { | |||
GetMemberlistCluster() memberlist.Interface | |||
GetNodeLister() corelisters.NodeLister | |||
GetBGPPolicyInfoQuerier() querier.AgentBGPPolicyInfoQuerier | |||
GetFqdnCache(querier.FQDNCacheFilter) []types.DnsCacheEntry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems wrong: there is no need to add this function to the interface because you can already call GetNetworkPolicyInfoQuerier().GetFqdnCache(filter)
, which is what we do for other handlers.
func HandleFunc(aq agentquerier.AgentQuerier) http.HandlerFunc { | ||
return func(w http.ResponseWriter, r *http.Request) { | ||
fqdnFilter := newFilterFromURLQuery(r.URL.Query()) | ||
dnsEntryCache := aq.GetFqdnCache(fqdnFilter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be GetFQDNCache
per the style guide. Please update all other occurrences accordingly.
Signed-off-by: Dhruv-J <[email protected]>
Signed-off-by: Dhruv Jain <[email protected]>
Signed-off-by: Dhruv-J <[email protected]>
Signed-off-by: Dhruv Jain <[email protected]>
pkg/agent/apis/types.go
Outdated
FqdnName string `json:"fqdnName,omitempty"` | ||
IpAddress string `json:"ipAddress,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FqdnName string `json:"fqdnName,omitempty"` | |
IpAddress string `json:"ipAddress,omitempty"` | |
FQDNName string `json:"fqdnName,omitempty"` | |
IPAddress string `json:"ipAddress,omitempty"` |
Please review other occurrences as well
return func(w http.ResponseWriter, r *http.Request) { | ||
fqdnFilter := newFilterFromURLQuery(r.URL.Query()) | ||
dnsEntryCache := npq.GetFQDNCache(fqdnFilter) | ||
resp := []agentapi.FQDNCacheResponse{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resp := []agentapi.FQDNCacheResponse{} | |
resp := make([]agentapi.FQDNCacheResponse, 0, len(dnsEntryCache)) |
to pre-allocate the memory
{ | ||
name: "FQDN cache exists - multiple addresses multiple domains", | ||
expectedStatus: http.StatusOK, | ||
expectedResponse: []types.DnsCacheEntry{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename this field to filteredCacheEntries
and make expectedResponse
a field of type []FQDNCacheResponse
instead. It will feel like you are duplicating data, but really it will lead to a cleaner test.
if err != nil { | ||
// this pattern will match no strings if there is an error with the regex formatting or usage with the user specified --domain flag | ||
pattern = regexp.MustCompile(`a\A`) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should handle such errors better, i.e., by returning a BadRequest HTTP code if the domain
arg is not valid.
so maybe the regex conversion should happen in the handler and FQDNCacheResponse
should include a regex instead of a string? cc @Dyanngg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about a warning displayed to the user but still an empty output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it's an invalid pattern, it should be an error. A warning in the Antrea Agent logs is much harder to find. We should not treat 1) an unsupported pattern, and 2) a valid pattern that doesn't match anything, in the same way IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops I thought you were done addressing comments, but looks like it is not the case
I will review again later
} else { | ||
// this pattern will match all strings if the filter is unset | ||
pattern = regexp.MustCompile(`.*`) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not needed, you should just skip regex evaluation in this case
Signed-off-by: Dhruv-J <[email protected]>
Signed-off-by: Dhruv-J <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some more comments
FqdnName: "example.com", | ||
IpAddress: net.ParseIP("10.0.0.1"), | ||
FQDNName: "example.com", | ||
IPAddress: net.ParseIP("10.0.0.1"), | ||
ExpirationTime: time.Date(2025, 12, 25, 15, 0, 0, 0, time.UTC), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the same time is used everywhere, I suggest you declare one variable of type time.Time
at the beginning of the test function, which you can then use everywhere. Actually, we usually do something like this:
expirationTime := time.Now().Add(1*time.Hour)
then use ExpirationTime: expirationTime
everywhere
expectedResponse: []apis.FQDNCacheResponse{ | ||
{ | ||
FQDNName: "example.com", | ||
IPAddress: net.ParseIP("10.0.0.1").String(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IPAddress: net.ParseIP("10.0.0.1").String(), | |
IPAddress: "10.0.0.1", |
same in other places below
handler.ServeHTTP(recorder, req) | ||
assert.Equal(t, tt.expectedStatus, recorder.Code) | ||
var receivedResponse []map[string]interface{} | ||
err = json.Unmarshal(recorder.Body.Bytes(), &receivedResponse) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unmarshal the response as a []apis.FQDNCacheResponse
, not as a []map[string]interface{}
and then just use assert.Equal(t, tt.expectedResponse, receivedResponse)
responseIPs: map[string]ipWithExpiration{ | ||
"10.0.0.1": { | ||
ip: net.ParseIP("10.0.0.1"), | ||
expirationTime: time.Date(2025, 12, 25, 15, 0, 0, 0, time.UTC), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as for the handler unit test
if err != nil { | ||
// this pattern will match no strings if there is an error with the regex formatting or usage with the user specified --domain flag | ||
pattern = regexp.MustCompile(`a\A`) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops I thought you were done addressing comments, but looks like it is not the case
I will review again later
Signed-off-by: Dhruv-J <[email protected]>
This PR adds functionality for a new antctl command: antctl get fqdncache This command fetches the DNS mapping entries for FQDN policies by reading the cache for DNS entries and outputting FQDN name, associated IP, and expiration time. It can also be used with a --domain (-d) flag that can be specified to filter the result for only a certain FQDN name.