Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix regression in source IP spoofing #1085

Merged
merged 2 commits into from
May 28, 2024

Conversation

howardjohn
Copy link
Member

We were using the cfg in pool since #1040. cfg is not whether its used, but rather the intent of the user -- we may enable or disable at runtime.

This renames the field to make it clear it should not be used like this, and fixes the issue. This went through CI due to
#1084.

We were using the `cfg` in pool since istio#1040. `cfg` is not whether its
used, but rather the intent of the user -- we may enable or disable at
runtime.

This renames the field to make it clear it should not be used like this,
and fixes the issue. This went through CI due to
istio#1084.
@howardjohn howardjohn requested a review from a team as a code owner May 22, 2024 17:02
@istio-testing istio-testing added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 22, 2024
@ilrudie ilrudie self-assigned this May 22, 2024
@ilrudie ilrudie added the do-not-merge/hold Block automatic merging of a PR. label May 22, 2024
Copy link
Contributor

@ilrudie ilrudie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks correct but I left a few comments where I think it could be better/clearer. added a hold so you can choose address comments. Feel free to drop the hold if you don't want to address.

Comment on lines 65 to 68
let enable_orig_src = pi
.cfg
.explicitly_configure_original_source
.unwrap_or(transparent);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kind of off topic but:

What is the point of enable_orig_src and this second set of logic? It seems like we shouldn't get enable_orig_src and transparent with different values ever.

if the cfg is true then we either return true or an error and if error the we return early at the ? and shouldn't reach this line. so if the config asked for true they are both true effectively.

if cfg contains false we return false so both are false.

if cfg contains none we best effort and return the result, go through the unwrap_or to set enable_orig_src to be the value in transparent.

what am I missing?

Copy link
Contributor

@bleggett bleggett May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the cfg is true then we either return true or an error and if error the we return early at the ? and shouldn't reach this line. so if the config asked for true they are both true effectively.

maybe_set_transparent shouldn't propagate the error yea, it should swallow/log and return the bool.

EDIT: nevermind.

What is the point of enable_orig_src and this second set of logic? It seems like we shouldn't get enable_orig_src and transparent with different values ever.

there's a few states here:

orig_src == true > if we can set transparent, then good, if we cannot, then error.

orig_src == unset -> if we can set transparent, then inform other things we are using orig_src, otherwise inform them we are not.

tl;dr

maybe_set_transparent might return an error, or false - as dictated by the cfg. It's a weird sort of fallthru.

Copy link
Contributor

@ilrudie ilrudie May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if maybe set_transparent returns an error then we never get here though. we return early at the ?. Aside from that if maybe_set_returns an error then we are in conflict and we just treat error as meaning false so as to not return early we wind up at a different problem. The active state would be false even though the requested state is true. Setting the variable that contains the active state to be true because the user asked for it means that variable no longer actually reflects the active state so that seems wrong to me.

tldr, seems like if we want to know the active state we should just use transparent which contains the active state. I think that is what we are doing, in a round about way, so I don't see a bug in the present logic it just seems like an unnecessary step is being performed.

Copy link
Contributor

@bleggett bleggett May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just panic if the config flag is set and we can't do maybe_set_transparent because of lack of perms, then we can just use transparent in all cases?

I'm not sure we care about any other permutations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if maybe set_transparent returns an error then we never get here though. we return early at the ?

that's true, but it will not always return an error, whether it does or not is dependent on this flag.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it either has an error or it contains the state that resulted when we tried to set transparent, yeah? if it's an error then enable_orig_src doesn't matter because we never reach it. If its not an error then we really just want what is in transparent

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If its not an error then we really just want what is in transparent

unless enable_orig_src was explicitly set to false as user intent, in which case we want to discard the result of the attempt and use the user intent.

It's goofy. I think we should use 2 config fields.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but if the user asked for false we just don't attempt to set transparent and return false

https://github.com/howardjohn/ztunnel/blob/2eae711bc42832d928f2d16d3dfc7687bd34c09a/src/proxy.rs#L409

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intent (and, afaik, implementation) is:

  • default, None: try to set transparent (requires privs), if it doesn't work proceed with original_src=false
  • Some(true): set transparent or fail. use original src
  • Some(false): don't bother trying transparent, do not use original src

The higher level intent is to make it so we gracefully degrade without privs, but can explicitly require/not

@@ -337,6 +336,7 @@ impl WorkloadHBONEPool {
// Callers should then be safe to drop() the pool instance.
pub fn new(
cfg: Arc<crate::config::Config>,
original_source: bool,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could probably be made more clear that this is expected to reflect running state rather than user intent. Maybe just add to the comment on new. I don't really know how to better name the parameter to add that kind of information.

Copy link
Contributor

@bleggett bleggett May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the idea is to convey that thru mutability, cfg isn't mutable, so it's by-definition fixed config/intent - you can't change it. If you want to calc something at runtime, calculate it, bind it, and pass it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do agree this should be a parameter. What I was attempting to "solve for" is misuse in the future by accidentally just setting this parameter to what's in the config because it's not actually all that clear what the parameter is for. Then we're sort of right back here.

Copy link
Contributor

@bleggett bleggett May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's why I think the config flag should just be two flags - the main problem is that the config property itself has overloaded meaning that's not obvious from looking at it: #1085 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as it is we have the advantage of not being able to specify an invalid state... what if they set require and disable to true?

Copy link
Contributor

@bleggett bleggett May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make it a 3 way enum - the potential for invalid state is sort of an inescapable aspect of you telling us to do A but us maybe not being able to do it because of circumstances beyond our control.

.enable_original_source
.unwrap_or_default()
.then_some(key.src);
let local = self.original_source.then_some(key.src);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh yeah I missed this.

Copy link
Contributor

@ilrudie ilrudie May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well back when config was mut I think we mutated the meaning of that field from "user intent" to "active state" so it is pretty subtle

src/config.rs Outdated
pub enable_original_source: Option<bool>,
// If set, explicitly configure whether to use original source.
// If unset (recommended), this is automatically detected based on permissions.
pub explicitly_configure_original_source: Option<bool>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub explicitly_configure_original_source: Option<bool>,
// If set, throw an error if we lack the permissions to enable original_source.
// If unset (recommended), use of original_source is automatically detected based on permissions.
pub require_original_source: Option<bool>,

maybe this?

Copy link
Contributor

@bleggett bleggett May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe we just split this into two, since the preferred default is autodetection.

require_original_source
disable_original_source

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just a 3 way enum -

orig_src: enable | disable | autodetect

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just a 3 way enum

If it wasn't clear, that is what Option<bool> is (just in a more confusing representation). Agree an enum is way more explicit.

Copy link
Contributor

@ilrudie ilrudie May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there are two aspects to this:

  1. internal representation, enum is awesome if we want to go to there, I have no problem changing from Option<bool> to something clearer internally.
  2. external representation, enum is basically unsupported so it's string or int... I think keeping the optional bool and adding documentation if we feel it is unclear makes sense for this.

Comment on lines 65 to 68
let enable_orig_src = pi
.cfg
.explicitly_configure_original_source
.unwrap_or(transparent);
Copy link
Contributor

@bleggett bleggett May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just panic if the config flag is set and we can't do maybe_set_transparent because of lack of perms, then we can just use transparent in all cases?

I'm not sure we care about any other permutations.

@ilrudie
Copy link
Contributor

ilrudie commented May 22, 2024

I think we should just panic if the config flag is set and we can't do maybe_set_transparent because of lack of perms, then we can just use transparent in all cases?

I didn't check what we do with the returned error but this doesn't seem recoverable really. Perhaps we send some zds response that we can't configure or something along those lines...

@howardjohn howardjohn removed the do-not-merge/hold Block automatic merging of a PR. label May 28, 2024
@istio-testing istio-testing merged commit 6532c55 into istio:master May 28, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants