
When low_loss increases, why does annualized loss decrease? #11

Open
Osipion opened this issue Mar 12, 2020 · 2 comments

Comments


Osipion commented Mar 12, 2020

First off, great effort - really good to see quantitative approaches to InfoSec risk, thanks. This may be a stats newbie question, but consider the following test.csv:

A,DoS,0.5,1,100000

The low_loss value is $1, and the high_loss is $100,000. The output of riskquant --file test.csv is:

A,DoS,"$72,200"

So, the annualized loss is $72,200 (72% of the high_loss).

When we increment the low_loss value to $1000 without changing the high_loss like so:

A,DoS,0.5,1000,100000

The output of riskquant --file test.csv is:

A,DoS,"$13,300"

So now the annualized loss is only 13% of the high_loss, which seems counterintuitive - I would have thought that raising the low_loss whilst holding all other things equal would have increased the annualized loss. If we continue raising the low_loss value, we get some interesting behaviour:

A,DoS,0.5,99999,100000

Gives an annualized loss of $50,000, so it appears to go back up, but still not quite as high as when the low_loss was 1.

What am I missing here? Is this right? If so, why?
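For reference, both figures can be reproduced outside riskquant with a short scipy sketch of the same lognormal fit (the helper name `annualized_loss` here is mine for illustration, but the formulas mirror the SimpleLoss code quoted later in this thread):

```python
import math
from scipy.stats import lognorm, norm

def annualized_loss(frequency, low_loss, high_loss):
    # Fit a lognormal whose 5th/95th percentiles land near low_loss/high_loss,
    # then multiply its mean by the event frequency.
    factor = -0.5 / norm.ppf(0.05)  # ~0.304
    mu = (math.log(low_loss) + math.log(high_loss)) / 2
    shape = factor * (math.log(high_loss) - math.log(low_loss))
    return frequency * lognorm(shape, scale=math.exp(mu)).mean()

print(round(annualized_loss(0.5, 1, 100_000)))      # ≈ 72,200
print(round(annualized_loss(0.5, 1_000, 100_000)))  # ≈ 13,320, shown as $13,300
```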

Author

Osipion commented Mar 12, 2020

It appears this happens for the first ~0.5% of the range, after which it starts to behave as I would expect (losses increase as the low_loss increases). I pulled out the SimpleLoss code into a notebook:

import math
import ipywidgets as widgets
import matplotlib.pyplot as plt
from IPython.display import display
from scipy.stats import lognorm, norm

out_al = widgets.Output()

def distribution(frequency, low_loss, high_loss):
    # Fit a lognormal so low_loss/high_loss land near the 5th/95th percentiles
    factor = -0.5 / norm.ppf(0.05)
    mu = (math.log(low_loss) + math.log(high_loss)) / 2.
    shape = factor * (math.log(high_loss) - math.log(low_loss))
    return lognorm(shape, scale=math.exp(mu))

def annualized_loss(frequency, low_loss, high_loss):
    return frequency * distribution(frequency, low_loss, high_loss).mean()

def gen_al_curve(freq, mx):
    line = []
    for i in range(1, mx):
        loss = annualized_loss(freq, i, mx)
        # Report each step where the curve turns upward
        if line and loss > line[-1]:
            print(f'inflects at {i} ({loss})')
        line.append(loss)
    plt.plot(line)
    with out_al:
        plt.show()

display(out_al)
gen_al_curve(0.5, 10000)

And see the following:

[Plot: annualized loss vs. low_loss with high_loss = 10,000 - the curve falls, bottoms out, then rises again]

Which shows the inflection happens when low_loss is 46/10,000 (0.46% of the high_loss). So I guess this is an expected behaviour for "too small" values?
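The turning point can also be located analytically (a sketch under the same fit as the notebook code above): the annualized loss is proportional to exp(mu + shape²/2), and writing x = log(low_loss), that exponent is x/2 + factor²·(log(high_loss) − x)²/2 plus a constant, whose derivative in x vanishes at log(high_loss) − x = 1/(2·factor²).

```python
import math
from scipy.stats import norm

factor = -0.5 / norm.ppf(0.05)  # same constant as in SimpleLoss, ~0.304
high_loss = 10_000
# The exponent x/2 + factor**2 * (log(high_loss) - x)**2 / 2  (x = log(low_loss))
# is minimized where log(high_loss) - x = 1 / (2 * factor**2):
low_star = high_loss * math.exp(-1 / (2 * factor**2))
print(low_star)  # ≈ 44.7, i.e. ~0.45% of high_loss
```

That matches the plot: the discrete curve bottoms out between 44 and 45, so the first upward step is reported at 46.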

But this does lead to an interesting problem. Say I was trying to model the risk posed by MITRE ATT&CK Impact T1531 (Account Access Removal). Loss of access to a single account might have an average cost of $2 or so and be very likely (say a 99% chance it will happen to at least one customer a year). But a large breach that resulted in thousands of customers losing access to accounts could be very costly (let's say $1,000,000). If I model this as one loss scenario I would have:

frequency: 0.99
low_impact: $2
high_impact: $1,000,000
annualized_loss: $3,992,790.4

Which doesn't seem to accurately capture the risk (in 1 year I'm getting roughly 4 * high_impact).

I know I've put a bit of a wall of text here, so to summarize:

  1. Is the initial decrease in annualized loss expected and correct?
  2. Do I need to be cautious of this when I structure my loss scenarios?
  3. Am I picking my scenarios wrong (e.g. should a single account losing access be a completely different scenario from multiple accounts losing access)?

Contributor

mdeshon commented Mar 15, 2020

Re: summary 1, yes, it's expected. The lognormal shape is proportional to log(high) - log(low), which is to say it depends on the ratio high/low. The lognormal is by definition non-negative; high shape values make the left side of the distribution very steep near the origin, which pushes the right tail out to much higher values. The Wikipedia entry for the lognormal distribution has a diagram showing this effect (where sigma is the shape value).
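A quick way to see that explosion numerically (my illustration, not from riskquant): a lognormal's mean is its median times exp(shape²/2), so the shapes implied by the two test.csv rows above (≈3.5 for low_loss = 1, ≈1.4 for low_loss = 1000) give very different multipliers:

```python
import math

# Lognormal mean = median * exp(shape**2 / 2); the multiplier alone:
for shape in (1.4, 3.5):
    print(f"shape {shape}: mean is {math.exp(shape**2 / 2):.1f}x the median")
# shape 1.4 → ≈2.7x the median; shape 3.5 → ≈457x the median
```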

Re: summary 2, yes - if you're using the lognormal, you usually don't want the high/low ratio to be that large (500,000 in your example), because the distribution becomes extremely skewed, leading to the unrealistic annualized-loss values you observed.

There are a few things I can suggest. First, you're treating the frequency as a probability that can't exceed 1, but it can exceed 1 when you mean the event is expected to occur more than once a year on average. (Note that I had a bug that enforced frequency <= 1, but that's been fixed since PR #10 - sorry if it caused confusion.)

Re: summary 3, It does seem like the single account scenario is distinct from the "large breach" situation. In that case, you could have the small-scale loss have a smaller high end loss, while the large breach starts at a higher low loss.

For example (just for illustration, I am sure you can come up with better numbers):

  • Small account loss: Frequency: 10, Low impact: $2, High impact: $1000
  • Large scale breach: Frequency: 0.2, Low impact: $10,000, High impact: $1,000,000
>>> s1 = simpleloss.SimpleLoss('SMALL', 'Small loss', 10, 2, 1000)
>>> s2 = simpleloss.SimpleLoss('LARGE', 'Large loss', 0.2, 10000, 1000000)
>>> s1.annualized_loss()
2663.505611088662
>>> s2.annualized_loss()
53279.60195471087

which seems to capture your desired behavior better.
