Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update dependency tensorflow-gpu to v2 [SECURITY] #193

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

renovate[bot]
Copy link
Contributor

@renovate renovate bot commented Aug 6, 2024

This PR contains the following updates:

Package Change Age Adoption Passing Confidence
tensorflow-gpu ==1.14.0 -> ==2.4.0 age adoption passing confidence

GitHub Vulnerability Alerts

CVE-2020-5215

Impact

Converting a string (from Python) to a tf.float16 value results in a segmentation fault in eager mode as the format checks for this use case are only in the graph mode.

This issue can lead to denial of service in inference/training where a malicious attacker can send a data point which contains a string instead of a tf.float16 value.

Similar effects can be obtained by manipulating saved models and checkpoints whereby replacing a scalar tf.float16 value with a scalar string will trigger this issue due to automatic conversions.

This can be easily reproduced by tf.constant("hello", tf.float16), if eager execution is enabled.

Patches

We have patched the vulnerability in GitHub commit 5ac1b9.

We are additionally releasing TensorFlow 1.15.1 and 2.0.1 with this vulnerability patched.

TensorFlow 2.1.0 was released after we fixed the issue, thus it is not affected.

We encourage users to switch to TensorFlow 1.15.1, 2.0.1 or 2.1.0.

For more information

Please consult SECURITY.md for more information regarding the security model and how to contact us with issues and questions.

CVE-2020-15190

Impact

The tf.raw_ops.Switch operation takes as input a tensor and a boolean and outputs two tensors. Depending on the boolean value, one of the tensors is exactly the input tensor whereas the other one should be an empty tensor.

However, the eager runtime traverses all tensors in the output:
https://github.com/tensorflow/tensorflow/blob/0e68f4d3295eb0281a517c3662f6698992b7b2cf/tensorflow/core/common_runtime/eager/kernel_and_device.cc#L308-L313

Since only one of the tensors is defined, the other one is nullptr, hence we are binding a reference to nullptr. This is undefined behavior and reported as an error if compiling with -fsanitize=null. In this case, this results in a segmentation fault

Patches

We have patched the issue in da8558533d925694483d2c136a9220d6d49d843c and will release a patch release for all affected versions.

We recommend users to upgrade to TensorFlow 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by members of the Aivul Team from Qihoo 360.

CVE-2020-15194

Impact

The SparseFillEmptyRowsGrad implementation has incomplete validation of the shapes of its arguments:
https://github.com/tensorflow/tensorflow/blob/0e68f4d3295eb0281a517c3662f6698992b7b2cf/tensorflow/core/kernels/sparse_fill_empty_rows_op.cc#L235-L241

Although reverse_index_map_t and grad_values_t are accessed in a similar pattern, only reverse_index_map_t is validated to be of proper shape. Hence, malicious users can pass a bad grad_values_t to trigger an assertion failure in vec, causing denial of service in serving installations.

Patches

We have patched the issue in 390611e0d45c5793c7066110af37c8514e6a6c54 and will release a patch release for all affected versions.

We recommend users to upgrade to TensorFlow 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability is a variant of GHSA-63xm-rx5p-xvqr

CVE-2020-15195

Impact

The implementation of SparseFillEmptyRowsGrad uses a double indexing pattern:
https://github.com/tensorflow/tensorflow/blob/0e68f4d3295eb0281a517c3662f6698992b7b2cf/tensorflow/core/kernels/sparse_fill_empty_rows_op.cc#L263-L269

It is possible for reverse_index_map(i) to be an index outside of bounds of grad_values, thus resulting in a heap buffer overflow.

Patches

We have patched the issue in 390611e0d45c5793c7066110af37c8514e6a6c54 and will release a patch release for all affected versions.

We recommend users to upgrade to TensorFlow 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by members of the Aivul Team from Qihoo 360.

CVE-2020-15202

Impact

The Shard API in TensorFlow expects the last argument to be a function taking two int64 (i.e., long long) arguments:
https://github.com/tensorflow/tensorflow/blob/0e68f4d3295eb0281a517c3662f6698992b7b2cf/tensorflow/core/util/work_sharder.h#L59-L60

However, there are several places in TensorFlow where a lambda taking int or int32 arguments is being used:
https://github.com/tensorflow/tensorflow/blob/0e68f4d3295eb0281a517c3662f6698992b7b2cf/tensorflow/core/kernels/random_op.cc#L204-L205
https://github.com/tensorflow/tensorflow/blob/0e68f4d3295eb0281a517c3662f6698992b7b2cf/tensorflow/core/kernels/random_op.cc#L317-L318

In these cases, if the amount of work to be parallelized is large enough, integer truncation occurs. Depending on how the two arguments of the lambda are used, this can result in segfaults, read/write outside of heap allocated arrays, stack overflows, or data corruption.

Patches

We have patched the issue in 27b417360cbd671ef55915e4bb6bb06af8b8a832 and ca8c013b5e97b1373b3bb1c97ea655e69f31a575. We will release patch releases for all versions between 1.15 and 2.3.

We recommend users to upgrade to TensorFlow 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by members of the Aivul Team from Qihoo 360.

CVE-2020-15203

Impact

By controlling the fill argument of tf.strings.as_string, a malicious attacker is able to trigger a format string vulnerability due to the way the internal format use in a printf call is constructed: https://github.com/tensorflow/tensorflow/blob/0e68f4d3295eb0281a517c3662f6698992b7b2cf/tensorflow/core/kernels/as_string_op.cc#L68-L74

This can result in unexpected output:

In [1]: tf.strings.as_string(input=[1234], width=6, fill='-')                                                                     
Out[1]: <tf.Tensor: shape=(1,), dtype=string, numpy=array(['1234  '], dtype=object)>                                              
In [2]: tf.strings.as_string(input=[1234], width=6, fill='+')                                                                     
Out[2]: <tf.Tensor: shape=(1,), dtype=string, numpy=array([' +1234'], dtype=object)> 
In [3]: tf.strings.as_string(input=[1234], width=6, fill="h")                                                                     
Out[3]: <tf.Tensor: shape=(1,), dtype=string, numpy=array(['%6d'], dtype=object)> 
In [4]: tf.strings.as_string(input=[1234], width=6, fill="d")                                                                     
Out[4]: <tf.Tensor: shape=(1,), dtype=string, numpy=array(['12346d'], dtype=object)> 
In [5]: tf.strings.as_string(input=[1234], width=6, fill="o")
Out[5]: <tf.Tensor: shape=(1,), dtype=string, numpy=array(['23226d'], dtype=object)>
In [6]: tf.strings.as_string(input=[1234], width=6, fill="x")
Out[6]: <tf.Tensor: shape=(1,), dtype=string, numpy=array(['4d26d'], dtype=object)>
In [7]: tf.strings.as_string(input=[1234], width=6, fill="g")
Out[7]: <tf.Tensor: shape=(1,), dtype=string, numpy=array(['8.67458e-3116d'], dtype=object)>
In [8]: tf.strings.as_string(input=[1234], width=6, fill="a")
Out[8]: <tf.Tensor: shape=(1,), dtype=string, numpy=array(['0x0.00ff7eebb4d4p-10226d'], dtype=object)>
In [9]: tf.strings.as_string(input=[1234], width=6, fill="c")
Out[9]: <tf.Tensor: shape=(1,), dtype=string, numpy=array(['\xd26d'], dtype=object)>
In [10]: tf.strings.as_string(input=[1234], width=6, fill="p")
Out[10]: <tf.Tensor: shape=(1,), dtype=string, numpy=array(['0x4d26d'], dtype=object)>
In [11]: tf.strings.as_string(input=[1234], width=6, fill='m') 
Out[11]: <tf.Tensor: shape=(1,), dtype=string, numpy=array(['Success6d'], dtype=object)>

However, passing in n or s results in segmentation fault.

Patches

We have patched the issue in 33be22c65d86256e6826666662e40dbdfe70ee83 and will release patch releases for all versions between 1.15 and 2.3.

We recommend users to upgrade to TensorFlow 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by members of the Aivul Team from Qihoo 360.

CVE-2020-15205

Impact

The data_splits argument of tf.raw_ops.StringNGrams lacks validation. This allows a user to pass values that can cause heap overflow errors and even leak contents of memory

>>> tf.raw_ops.StringNGrams(data=["aa", "bb", "cc", "dd", "ee", "ff"], data_splits=[0,8], separator=" ", ngram_widths=[3], left_pad="", right_pad="", pad_width=0, preserve_short_sequences=False)
StringNGrams(ngrams=<tf.Tensor: shape=(6,), dtype=string, numpy=
array([b'aa bb cc', b'bb cc dd', b'cc dd ee', b'dd ee ff',
       b'ee ff \xf4j\xa7q\x7f\x00\x00q\x00\x00\x00\x00\x00\x00\x00\xd8\x9b~\xa8q\x7f\x00',
       b'ff \xf4j\xa7q\x7f\x00\x00q\x00\x00\x00\x00\x00\x00\x00\xd8\x9b~\xa8q\x7f\x00 \x9b~\xa8q\x7f\x00\x00p\xf5j\xa7q\x7f\x00\x00H\xf8j\xa7q\x7f\x00\x00\xf0\xf3\xf7\x85q\x7f\x00\x00`}\xa6\x00\x00\x00\x00\x00`~\xa6\x00\x00\x00\x00\x00\xb0~\xeb\x9bq\x7f\x00'],...

All the binary strings after ee ff are contents from the memory stack. Since these can contain return addresses, this data leak can be used to defeat ASLR.

Patches

We have patched the issue in 0462de5b544ed4731aa2fb23946ac22c01856b80 and will release patch releases for all versions between 1.15 and 2.3.

We recommend users to upgrade to TensorFlow 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by members of the Aivul Team from Qihoo 360.

CVE-2020-15206

Impact

Changing the TensorFlow's SavedModel protocol buffer and altering the name of required keys results in segfaults and data corruption while loading the model. This can cause a denial of service in products using tensorflow-serving or other inference-as-a-service installments.

We have added fixes to this in f760f88b4267d981e13f4b302c437ae800445968 and fcfef195637c6e365577829c4d67681695956e7d (both going into TensorFlow 2.2.0 and 2.3.0 but not yet backported to earlier versions). However, this was not enough, as #​41097 reports a different failure mode.

Patches

We have patched the issue in adf095206f25471e864a8e63a0f1caef53a0e3a6 and will release patch releases for all versions between 1.15 and 2.3. Patch releases for versions between 1.15 and 2.1 will also contain cherry-picks of f760f88b4267d981e13f4b302c437ae800445968 and fcfef195637c6e365577829c4d67681695956e7d.

We recommend users to upgrade to TensorFlow 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by Shuaike Dong, from Alipay Tian Qian Security Lab && Lab for Applied Security Research, CUHK.

CVE-2020-15204

Impact

In eager mode, TensorFlow does not set the session state. Hence, calling tf.raw_ops.GetSessionHandle or tf.raw_ops.GetSessionHandleV2 results in a null pointer dereference:
https://github.com/tensorflow/tensorflow/blob/0e68f4d3295eb0281a517c3662f6698992b7b2cf/tensorflow/core/kernels/session_ops.cc#L45

In the above snippet, in eager mode, ctx->session_state() returns nullptr. Since code immediately dereferences this, we get a segmentation fault.

Patches

We have patched the issue in 9a133d73ae4b4664d22bd1aa6d654fec13c52ee1 and will release patch releases for all versions between 1.15 and 2.3.

We recommend users to upgrade to TensorFlow 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by members of the Aivul Team from Qihoo 360.

CVE-2020-15207

Impact

To mimic Python's indexing with negative values, TFLite uses ResolveAxis to convert negative values to positive indices. However, the only check that the converted index is now valid is only present in debug builds:
https://github.com/tensorflow/tensorflow/blob/0e68f4d3295eb0281a517c3662f6698992b7b2cf/tensorflow/lite/kernels/internal/reference/reduce.h#L68-L72

If the DCHECK does not trigger, then code execution moves ahead with a negative index. This, in turn, results in accessing data out of bounds which results in segfaults and/or data corruption.

Patches

We have patched the issue in 2d88f470dea2671b430884260f3626b1fe99830a and will release patch releases for all versions between 1.15 and 2.3.

We recommend users to upgrade to TensorFlow 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by members of the Aivul Team from Qihoo 360.

CVE-2020-15208

Impact

When determining the common dimension size of two tensors, TFLite uses a DCHECK which is no-op outside of debug compilation modes:
https://github.com/tensorflow/tensorflow/blob/0e68f4d3295eb0281a517c3662f6698992b7b2cf/tensorflow/lite/kernels/internal/types.h#L437-L442

Since the function always returns the dimension of the first tensor, malicious attackers can craft cases where this is larger than that of the second tensor. In turn, this would result in reads/writes outside of bounds since the interpreter will wrongly assume that there is enough data in both tensors.

Patches

We have patched the issue in 8ee24e7949a20 and will release patch releases for all versions between 1.15 and 2.3.

We recommend users to upgrade to TensorFlow 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by members of the Aivul Team from Qihoo 360.

CVE-2020-15209

Impact

A crafted TFLite model can force a node to have as input a tensor backed by a nullptr buffer. This can be achieved by changing a buffer index in the flatbuffer serialization to convert a read-only tensor to a read-write one. The runtime assumes that these buffers are written to before a possible read, hence they are initialized with nullptr:
https://github.com/tensorflow/tensorflow/blob/0e68f4d3295eb0281a517c3662f6698992b7b2cf/tensorflow/lite/core/subgraph.cc#L1224-L1227

However, by changing the buffer index for a tensor and implicitly converting that tensor to be a read-write one, as there is nothing in the model that writes to it, we get a null pointer dereference.

Patches

We have patched the issue in 0b5662bc and will release patch releases for all versions between 1.15 and 2.3.

We recommend users to upgrade to TensorFlow 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by members of the Aivul Team from Qihoo 360 but was also discovered through variant analysis of GHSA-cvpc-8phh-8f45.

CVE-2020-15265

Impact

An attacker can pass an invalid axis value to tf.quantization.quantize_and_dequantize:

tf.quantization.quantize_and_dequantize(
    input=[2.5, 2.5], input_min=[0,0], input_max=[1,1], axis=10)

This results in accessing a dimension outside the rank of the input tensor in the C++ kernel implementation:

const int depth = (axis_ == -1) ? 1 : input.dim_size(axis_);

However, dim_size only does a DCHECK to validate the argument and then uses it to access the corresponding element of an array:

int64 TensorShapeBase<Shape>::dim_size(int d) const {
  DCHECK_GE(d, 0);
  DCHECK_LT(d, dims());
  DoStuffWith(dims_[d]);
}

Since in normal builds, DCHECK-like macros are no-ops, this results in segfault and access out of bounds of the array.

Patches

We have patched the issue in eccb7ec454e6617738554a255d77f08e60ee0808 and will release TensorFlow 2.4.0 containing the patch. TensorFlow nightly packages after this commit will also have the issue resolved.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported in #​42105

CVE-2020-15266

Impact

When the boxes argument of tf.image.crop_and_resize has a very large value, the CPU kernel implementation receives it as a C++ nan floating point value. Attempting to operate on this is undefined behavior which later produces a segmentation fault.

Patches

We have patched the issue in c0319231333f0f16e1cc75ec83660b01fedd4182 and will release TensorFlow 2.4.0 containing the patch. TensorFlow nightly packages after this commit will also have the issue resolved.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported in #​42129

CVE-2020-26266

Impact

Under certain cases, a saved model can trigger use of uninitialized values during code execution. This is caused by having tensor buffers be filled with the default value of the type but forgetting to default initialize the quantized floating point types in Eigen:

struct QUInt8 {
  QUInt8() {}
  // ...
  uint8_t value;
};

struct QInt16 {
  QInt16() {}
  // ...
  int16_t value;
};

struct QUInt16 {
  QUInt16() {}
  // ...
  uint16_t value;
};

struct QInt32 {
  QInt32() {}
  // ...
  int32_t value;
};

Patches

We have patched the issue in GitHub commit ace0c15a22f7f054abcc1f53eabbcb0a1239a9e2 and will release TensorFlow 2.4.0 containing the patch. TensorFlow nightly packages after this commit will also have the issue resolved.

Since this issue also impacts TF versions before 2.4, we will patch all releases between 1.15 and 2.3 inclusive.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

CVE-2020-26267

Impact

The tf.raw_ops.DataFormatVecPermute API does not validate the src_format and dst_format attributes. The code assumes that these two arguments define a permutation of NHWC.

However, these assumptions are not checked and this can result in uninitialized memory accesses, read outside of bounds and even crashes.

>>> import tensorflow as tf
>>> tf.raw_ops.DataFormatVecPermute(x=[1,4], src_format='1234', dst_format='1234')
<tf.Tensor: shape=(2,), dtype=int32, numpy=array([4, 757100143], dtype=int32)>
...
>>> tf.raw_ops.DataFormatVecPermute(x=[1,4], src_format='HHHH', dst_format='WWWW')
<tf.Tensor: shape=(2,), dtype=int32, numpy=array([4, 32701], dtype=int32)>
...
>>> tf.raw_ops.DataFormatVecPermute(x=[1,4], src_format='H', dst_format='W')
<tf.Tensor: shape=(2,), dtype=int32, numpy=array([4, 32701], dtype=int32)>
>>> tf.raw_ops.DataFormatVecPermute(x=[1,2,3,4], 
                                    src_format='1234', dst_format='1253')
<tf.Tensor: shape=(4,), dtype=int32, numpy=array([4, 2, 939037184, 3], dtype=int32)>
...
>>> tf.raw_ops.DataFormatVecPermute(x=[1,2,3,4],
                                    src_format='1234', dst_format='1223')
<tf.Tensor: shape=(4,), dtype=int32, numpy=array([4, 32701, 2, 3], dtype=int32)>
...
>>> tf.raw_ops.DataFormatVecPermute(x=[1,2,3,4],
                                    src_format='1224', dst_format='1423')
<tf.Tensor: shape=(4,), dtype=int32, numpy=array([1, 4, 3, 32701], dtype=int32)>
...
>>> tf.raw_ops.DataFormatVecPermute(x=[1,2,3,4], src_format='1234', dst_format='432')
<tf.Tensor: shape=(4,), dtype=int32, numpy=array([4, 3, 2, 32701], dtype=int32)>
...
>>> tf.raw_ops.DataFormatVecPermute(x=[1,2,3,4],
                                    src_format='12345678', dst_format='87654321')
munmap_chunk(): invalid pointer
Aborted
...
>>> tf.raw_ops.DataFormatVecPermute(x=[[1,5],[2,6],[3,7],[4,8]],           
                                    src_format='12345678', dst_format='87654321')
<tf.Tensor: shape=(4, 2), dtype=int32, numpy=
array([[71364624,        0],
       [71365824,        0],
       [     560,        0],
       [      48,        0]], dtype=int32)>
...
>>> tf.raw_ops.DataFormatVecPermute(x=[[1,5],[2,6],[3,7],[4,8]], 
                                    src_format='12345678', dst_format='87654321')
free(): invalid next size (fast)
Aborted

A similar issue occurs in tf.raw_ops.DataFormatDimMap, for the same reasons:

>>> tf.raw_ops.DataFormatDimMap(x=[[1,5],[2,6],[3,7],[4,8]], src_format='1234',
>>> dst_format='8765')
<tf.Tensor: shape=(4, 2), dtype=int32, numpy=
array([[1954047348, 1954047348],
       [1852793646, 1852793646],
       [1954047348, 1954047348],
       [1852793632, 1852793632]], dtype=int32)>

Patches

We have patched the issue in GitHub commit ebc70b7a592420d3d2f359e4b1694c236b82c7ae and will release TensorFlow 2.4.0 containing the patch. TensorFlow nightly packages after this commit will also have the issue resolved.

Since this issue also impacts TF versions before 2.4, we will patch all releases between 1.15 and 2.3 inclusive.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by members of the Aivul Team from Qihoo 360.

CVE-2020-26268

Impact

The tf.raw_ops.ImmutableConst operation returns a constant tensor created from a memory mapped file which is assumed immutable. However, if the type of the tensor is not an integral type, the operation crashes the Python interpreter as it tries to write to the memory area:

>>> import tensorflow as tf
>>> with open('/tmp/test.txt','w') as f: f.write('a'*128)
>>> tf.raw_ops.ImmutableConst(dtype=tf.string,shape=2,
                              memory_region_name='/tmp/test.txt')

If the file is too small, TensorFlow properly returns an error as the memory area has fewer bytes than what is needed for the tensor it creates. However, as soon as there are enough bytes, the above snippet causes a segmentation fault.

This is because the alocator used to return the buffer data is not marked as returning an opaque handle since the needed virtual method is not overriden.

Patches

We have patched the issue in GitHub commit c1e1fc899ad5f8c725dcbb6470069890b5060bc7 and will release TensorFlow 2.4.0 containing the patch. TensorFlow nightly packages after this commit will also have the issue resolved.

Since this issue also impacts TF versions before 2.4, we will patch all releases between 1.15 and 2.3 inclusive.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by members of the Aivul Team from Qihoo 360.

CVE-2020-26270

Impact

Running an LSTM/GRU model where the LSTM/GRU layer receives an input with zero-length results in a CHECK failure when using the CUDA backend.

This can result in a query-of-death vulnerability, via denial of service, if users can control the input to the layer.

Patches

We have patched the issue in GitHub commit 14755416e364f17fb1870882fa778c7fec7f16e3 and will release TensorFlow 2.4.0 containing the patch. TensorFlow nightly packages after this commit will also have the issue resolved.

Since this issue also impacts TF versions before 2.4, we will patch all releases between 1.15 and 2.3 inclusive.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

CVE-2020-26271

Impact

Under certain cases, loading a saved model can result in accessing uninitialized memory while building the computation graph. The MakeEdge function creates an edge between one output tensor of the src node (given by output_index) and the input slot of the dst node (given by input_index). This is only possible if the types of the tensors on both sides coincide, so the function begins by obtaining the corresponding DataType values and comparing these for equality:

  DataType src_out = src->output_type(output_index);
  DataType dst_in = dst->input_type(input_index);
  //...

However, there is no check that the indices point to inside of the arrays they index into. Thus, this can result in accessing data out of bounds of the corresponding heap allocated arrays.

In most scenarios, this can manifest as unitialized data access, but if the index points far away from the boundaries of the arrays this can be used to leak addresses from the library.

Patches

We have patched the issue in GitHub commit 0cc38aaa4064fd9e79101994ce9872c6d91f816b and will release TensorFlow 2.4.0 containing the patch. TensorFlow nightly packages after this commit will also have the issue resolved.

Since this issue also impacts TF versions before 2.4, we will patch all releases between 1.15 and 2.3 inclusive.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

CVE-2021-29619

Impact

Passing invalid arguments (e.g., discovered via fuzzing) to tf.raw_ops.SparseCountSparseOutput results in segfault.

Patches

We have patched the issue in GitHub commit 82e6203221865de4008445b13c69b6826d2b28d9.

The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

CVE-2021-29618

Impact

Passing a complex argument to tf.transpose at the same time as passing conjugate=True argument results in a crash:

import tensorflow as tf
tf.transpose(conjugate=True, a=complex(1))

Patches

We have received a patch for the issue in GitHub commit 1dc6a7ce6e0b3e27a7ae650bfc05b195ca793f88.

The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported in #​42105 and fixed in #​46973.

CVE-2021-29617

Impact

An attacker can cause a denial of service via CHECK-fail in tf.strings.substr with invalid arguments:

import tensorflow as tf
tf.strings.substr(input='abc', len=1, pos=[1,-1])
import tensorflow as tf
tf.strings.substr(input='abc', len=1, pos=[1,2])

Patches

We have received a patch for the issue in GitHub commit 890f7164b70354c57d40eda52dcdd7658677c09f.

The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported in #​46900 and fixed in #​46974.

CVE-2021-29616

Impact

The implementation of TrySimplify has undefined behavior due to dereferencing a null pointer in corner cases that
result in optimizing a node with no inputs.

Patches

We have patched the issue in GitHub commit e6340f0665d53716ef3197ada88936c2a5f7a2d3.

The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

CVE-2021-29615

Impact

The implementation of ParseAttrValue can be tricked into stack overflow due to recursion by giving in a specially crafted input.

Patches

We have patched the issue in GitHub commit e07e1c3d26492c06f078c7e5bf2d138043e199c1.

The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

CVE-2021-29614

Impact

The implementation of tf.io.decode_raw produces incorrect results and crashes the Python interpreter when combining fixed_length and wider datatypes.

import tensorflow as tf

tf.io.decode_raw(tf.constant(["1","2","3","4"]), tf.uint16, fixed_length=4)

The implementation of the padded version is buggy due to a confusion about pointer arithmetic rules.

First, the code computes the width of each output element by dividing the fixed_length value to the size of the type argument:

int width = fixed_length / sizeof(T);

The fixed_length argument is also used to determine the size needed for the output tensor:

TensorShape out_shape = input.shape();
out_shape.AddDim(width);
Tensor* output_tensor = nullptr;
OP_REQUIRES_OK(context, context->allocate_output("output", out_shape, &output_tensor));

auto out = output_tensor->flat_inner_dims<T>();
T* out_data = out.data();
memset(out_data, 0, fixed_length * flat_in.size());

This is followed by reencoding code:

for (int64 i = 0; i < flat_in.size(); ++i) {
  const T* in_data = reinterpret_cast<const T*>(flat_in(i).data());

  if (flat_in(i).size() > fixed_length) {
    memcpy(out_data, in_data, fixed_length);
  } else {
    memcpy(out_data, in_data, flat_in(i).size());
  }
  out_data += fixed_length;
}

The erroneous code is the last line above: it is moving the out_data pointer by fixed_length * sizeof(T) bytes whereas it only copied at most fixed_length bytes from the input. This results in parts of the input not being decoded into the output.

Furthermore, because the pointer advance is far wider than desired, this quickly leads to writing to outside the bounds of the backing data. This OOB write leads to interpreter crash in the reproducer mentioned here, but more severe attacks can be mounted too, given that this gadget allows writing to periodically placed locations in memory.

Patches

We have patched the issue in GitHub commit 698e01511f62a3c185754db78ebce0eee1f0184d.

The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

CVE-2021-29613

Impact

Incomplete validation in tf.raw_ops.CTCLoss allows an attacker to trigger an OOB read from heap:

import tensorflow as tf

inputs = tf.constant([], shape=[10, 16, 0], dtype=tf.float32)
labels_indices = tf.constant([], shape=[8, 0], dtype=tf.int64)
labels_values = tf.constant([-100] * 8, shape=[8], dtype=tf.int32)
sequence_length = tf.constant([-100] * 16, shape=[16], dtype=tf.int32)
  
tf.raw_ops.CTCLoss(inputs=inputs, labels_indices=labels_indices,
                   labels_values=labels_values, sequence_length=sequence_length,
                   preprocess_collapse_repeated=True, ctc_merge_repeated=False,
                   ignore_longer_outputs_than_inputs=True)

An attacker can also trigger a heap buffer overflow:

import tensorflow as tf

inputs = tf.constant([], shape=[7, 2, 0], dtype=tf.float32)
labels_indices = tf.constant([-100, -100], shape=[2, 1], dtype=tf.int64)
labels_values = tf.constant([-100, -100], shape=[2], dtype=tf.int32)
sequence_length = tf.constant([-100, -100], shape=[2], dtype=tf.int32)

tf.raw_ops.CTCLoss(inputs=inputs, labels_indices=labels_indices,
                   labels_values=labels_values, sequence_length=sequence_length,
                   preprocess_collapse_repeated=False, ctc_merge_repeated=False,
                   ignore_longer_outputs_than_inputs=False)

Finally, an attacker can trigger a null pointer dereference:

import tensorflow as tf

inputs = tf.constant([], shape=[0, 2, 11], dtype=tf.float32)
labels_indices = tf.constant([], shape=[0, 2], dtype=tf.int64)
labels_values = tf.constant([], shape=[0], dtype=tf.int32)
sequence_length = tf.constant([-100, -100], shape=[2], dtype=tf.int32)

tf.raw_ops.CTCLoss(inputs=inputs, labels_indices=labels_indices,
                   labels_values=labels_values, sequence_length=sequence_length,
                   preprocess_collapse_repeated=False, ctc_merge_repeated=False,
                   ignore_longer_outputs_than_inputs=False)

Patches

We have patched the issue in GitHub commit14607c0707040d775e06b6817325640cb4b5864c followed by GitHub commit 4504a081af71514bb1828048363e6540f797005b.

The fix will be included in TensorFlow 2.5.0. We will also cherrypick these commits on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by Yakun Zhang and Ying Wang of Baidu X-Team.

CVE-2021-29612

Impact

An attacker can trigger a heap buffer overflow in Eigen implementation of tf.raw_ops.BandedTriangularSolve:

import tensorflow as tf
import numpy as np
  
matrix_array = np.array([])
matrix_tensor = tf.convert_to_tensor(np.reshape(matrix_array,(0,1)),dtype=tf.float32)
rhs_array = np.array([1,1])
rhs_tensor = tf.convert_to_tensor(np.reshape(rhs_array,(1,2)),dtype=tf.float32)
tf.raw_ops.BandedTriangularSolve(matrix=matrix_tensor,rhs=rhs_tensor)

The implementation calls ValidateInputTensors for input validation but fails to validate that the two tensors are not empty:

void ValidateInputTensors(OpKernelContext* ctx, const Tensor& in0, const Tensor& in1) {
  OP_REQUIRES(
      ctx, in0.dims() >= 2, 
      errors::InvalidArgument("In[0] ndims must be >= 2: ", in0.dims()));

  OP_REQUIRES(
      ctx, in1.dims() >= 2,
      errors::InvalidArgument("In[1] ndims must be >= 2: ", in1.dims()));
}

Furthermore, since OP_REQUIRES macro only stops execution of current function after setting ctx->status() to a non-OK value, callers of helper functions that use OP_REQUIRES must check value of ctx->status() before continuing. This doesn't happen in this op's implementation, hence the validation that is present is also not effective.

Patches

We have patched the issue in GitHub commit ba6822bd7b7324ba201a28b2f278c29a98edbef2 followed by GitHub commit 0ab290774f91a23bebe30a358fde4e53ab4876a0.

The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by Ye Zhang and Yakun Zhang of Baidu X-Team.

CVE-2021-29610

Impact

The validation in tf.raw_ops.QuantizeAndDequantizeV2 allows invalid values for axis argument:

import tensorflow as tf

input_tensor = tf.constant([0.0], shape=[1], dtype=float)
input_min = tf.constant(-10.0)
input_max = tf.constant(-10.0)

tf.raw_ops.QuantizeAndDequantizeV2(
  input=input_tensor, input_min=input_min, input_max=input_max,
  signed_input=False, num_bits=1, range_given=False, round_mode='HALF_TO_EVEN',
  narrow_range=False, axis=-2)

The validation uses || to mix two different conditions:

OP_REQUIRES(ctx,
  (axis_ == -1 || axis_ < input.shape().dims()),
  errors::InvalidArgument(...));

If axis_ < -1 the condition in OP_REQUIRES will still be true, but this value of axis_ results in heap underflow. This allows attackers to read/write to other data on the heap.

Patches

We have patched the issue in GitHub commit c5b0d5f8ac19888e46ca14b0e27562e7fbbee9a9.

The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by Yakun Zhang and Ying Wang of Baidu X-Team.

CVE-2021-29609

Impact

Incomplete validation in SparseAdd results in allowing attackers to exploit undefined behavior (dereferencing null pointers) as well as write outside of bounds of heap allocated data:

import tensorflow as tf

a_indices = tf.zeros([10, 97], dtype=tf.int64)
a_values = tf.zeros([10], dtype=tf.int64)
a_shape = tf.zeros([0], dtype=tf.int64)

b_indices = tf.zeros([0, 0], dtype=tf.int64)
b_values = tf.zeros([0], dtype=tf.int64)
b_shape = tf.zeros([0], dtype=tf.int64)
  
thresh = 0

tf.raw_ops.SparseAdd(a_indices=a_indices,
                    a_values=a_values,
                    a_shape=a_shape,
                    b_indices=b_indices,
                    b_values=b_values,
                    b_shape=b_shape,
                    thresh=thresh)

The implementation has a large set of validation for the two sparse tensor inputs (6 tensors in total), but does not validate that the tensors are not empty or that the second dimension of *_indices matches the size of corresponding *_shape. This allows attackers to send tensor triples that represent invalid sparse tensors to abuse code assumptions that are not protected by validation.

Patches

We have patched the issue in GitHub commit 6fd02f44810754ae7481838b6a67c5df7f909ca3 followed by GitHub commit 41727ff06111117bdf86b37db198217fd7a143cc.

The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by Yakun Zhang and Ying Wang of Baidu X-Team.

CVE-2021-29608

Impact

Due to lack of validation in tf.raw_ops.RaggedTensorToTensor, an attacker can exploit an undefined behavior if input arguments are empty:

import tensorflow as tf

shape = tf.constant([-1, -1], shape=[2], dtype=tf.int64)
values = tf.constant([], shape=[0], dtype=tf.int64)
default_value = tf.constant(404, dtype=tf.int64)
row = tf.constant([269, 404, 0, 0, 0, 0, 0], shape=[7], dtype=tf.int64)
rows = [row]
types = ['ROW_SPLITS']

tf.raw_ops.RaggedTensorToTensor(
  shape=shape, values=values, default_value=default_value, 
  row_partition_tensors=rows, row_partition_types=types)

The implementation only checks that one of the tensors is not empty, but does not check for the other ones.

There are multiple DCHECK validations to prevent heap OOB, but these are no-op in release builds, hence they don't prevent anything.

Patches

We have patched the issue in GitHub commit b761c9b652af2107cfbc33efd19be0ce41daa33e followed by GitHub commit f94ef358bb3e91d517446454edff6535bcfe8e4a and GitHub commit c4d7afb6a5986b04505aca4466ae1951686c80f6.

The fix will be included in TensorFlow 2.5.0. We will also cherrypick these commits on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by Yakun Zhang and Ying Wang of Baidu X-Team.

CVE-2021-29595

Impact

The implementation of the DepthToSpace TFLite operator is vulnerable to a division by zero error:

const int block_size = params->block_size;
...
const int input_channels = input->dims->data[3];
... 
int output_channels = input_channels / block_size / block_size;

An attacker can craft a model such that params->block_size is 0.

Patches

We have patched the issue in GitHub commit 106d8f4fb89335a2c52d7c895b7a7485465ca8d9.

The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by members of the Aivul Team from Qihoo 360.

CVE-2021-29584

Impact

An attacker can trigger a denial of service via a CHECK-fail in caused by an integer overflow in constructing a new tensor shape:

import tensorflow as tf

input_layer = 2**60-1
sparse_data = tf.raw_ops.SparseSplit(
    split_dim=1, 
    indices=[(0, 0), (0, 1), (0, 2), 
    (4, 3), (5, 0), (5, 1)],
    values=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    shape=(input_layer, input_layer),
    num_split=2,
    name=None
    )

This is because the implementation builds a dense shape without checking that the dimensions would not result in overflow:

sparse::SparseTensor sparse_tensor;
OP_REQUIRES_OK(context,
               sparse::SparseTensor::Create(
                 input_indices, input_values,
                 TensorShape(input_shape.vec<int64>()), &sparse_tensor));

The TensorShape constructor uses a CHECK operation which triggers when InitDims returns a non-OK status.

template <class Shape>
TensorShapeBase<Shape>::TensorShapeBase(gtl::ArraySlice<int64> dim_sizes) {
  set_tag(REP16);
  set_data_type(DT_INVALID);
  TF_CHECK_OK(InitDims(dim_sizes));
}

In our scenario, this occurs when adding a dimension from the argument results in overflow:

template <class Shape>
Status TensorShapeBase<Shape>::InitDims(gtl::ArraySlice<int64> dim_sizes) {
  ...
  Status status = Status::OK();
  for (int64 s : dim_sizes) {
    status.Update(AddDimWithStatus(internal::SubtleMustCopy(s)));
    if (!status.ok()) {
      return status;
    }
  }
}

template <class Shape>
Status TensorShapeBase<Shape>::AddDimWithStatus(int64 size) {
  ...
  int64 new_num_elements;
  if (kIsPartial && (num_elements() < 0 || size < 0)) {
    new_num_elements = -1;
  } else {
    new_num_elements = MultiplyWithoutOverflow(num_elements(), size);
    if (TF_PREDICT_FALSE(new_num_elements < 0)) {
        return errors::Internal("Encountered overflow when multiplying ",
                                num_elements(), " with ", size,
                                ", result: ", new_num_elements);
      }
  }
  ...
}

This is a legacy implementation of the constructor and operations should use BuildTensorShapeBase or AddDimWithStatus to prevent CHECK-failures in the presence of overflows.

Patches

We have patched the issue in GitHub commit 4c0ee937c0f61c4fc5f5d32d9bb4c67428012a60.

The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

For more information

Please consult our security guide for more information regarding the security model and how to contact us with issues and questions.

Attribution

This vulnerability has been reported by researchers from University of Virginia and University of California, Santa Barbara.

CVE-2021-29583

Impact

The implementation of tf.raw_ops.FusedBatchNorm is vulnerable to a heap buffer overflow:

import tensorflow as tf

x = tf.zeros([10, 10, 10, 6], dtype=tf.float32)
scale = tf.constant([0.0], shape=[1], dtype=tf.float32)
offset = tf.constant([0.0], shape=[1], dtype=tf.float32)
mean = tf.constant([0.0], shape=[1], dtype=tf.float32)
variance = tf.constant([0.0], shape=[1], dtype=tf.float32)
epsilon = 0.0
exponential_avg_factor = 0.0
data_format = "NHWC"
is_training = False
    
tf.raw_ops.FusedBatchNorm(
  x=x, scale=scale, offset=offset, mean=mean, variance=variance,
  epsilon=epsilon, exponential_avg_factor=exponential_avg_factor,
  data_format=data_format, is_training=is_training)

If the tensors are empty, the same implementation can trigger undefined behavior by dereferencing null pointers:

import tensorflow as tf
import numpy as np

x = tf.zeros([10, 10, 10, 1], dtype=tf.float32)
scale = tf.constant([], shape=[0], dtype=tf.float32)
offset = tf.constant([], shape=[0], dtype=tf.float32)
mean = tf.constant([], shape=[0], dtype=tf.float32)
variance = tf.constant([], shape=[0], dtype=tf.float32)
epsilon = 0.0
exponential_avg_factor = 0.0
data_format = "NHWC"
is_training = False

tf.raw_ops.FusedBatchNorm(
  x=x, scale=scale, offset=offset, mean=mean, variance=variance, 
  epsilon=epsilon, exponential_avg_factor=exponential_avg_factor,
  data_format=

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0 participants