fixing various errors on the file classification_with_grn_and_vsn #2011

Humbulani1234 · 2024-12-22T00:45:42Z

Dataset preparation errors

I think the example file classification_with_grn_and_vsn.py in structured_data is using the wrong dataset: the data_url https://archive.ics.uci.edu/static/public/20/census+income.zip downloads an incorrect dataset. I believe the correct data_url is https://archive.ics.uci.edu/static/public/117/census+income+kdd.zip

A fix has been added to extract the .tar.gz file that is created during the keras.utils.get_file download.
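The extraction step could look roughly like the sketch below. It is not the code from this PR; it only uses the standard-library tarfile module, and the helper name extract_tar_gz and its arguments are hypothetical.

```python
import os
import tarfile


def extract_tar_gz(archive_path, dest_dir):
    """Extract a gzip-compressed tar archive into dest_dir.

    Hypothetical helper: returns the member names found in the archive.
    Only use this on archives from a source you trust.
    """
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as archive:
        names = archive.getnames()
        archive.extractall(path=dest_dir)
    return names
```

In the script this would be called once, after keras.utils.get_file returns, with the path of the downloaded .tar.gz file.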

A fix was also added to clean up the directory that the files were extracted to during download, so that the script can be run again without errors.
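A minimal sketch of such a cleanup step, assuming the extraction directory can simply be deleted before each run. The helper name remove_extracted_dir is hypothetical and this is not the exact code from the PR.

```python
import os
import shutil


def remove_extracted_dir(extract_dir):
    """Delete extract_dir and its contents; do nothing if it is absent.

    Returns True when a directory was removed, False otherwise.
    """
    if os.path.isdir(extract_dir):
        shutil.rmtree(extract_dir)
        return True
    return False
```

Because the helper is a no-op when the directory does not exist, it is safe to call on the first run as well as on later ones.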

Additionally, the original script has the code snippet:

train_data_path = os.path.join(
    os.path.expanduser("~"), ".keras", "datasets", "adult.data"
)
test_data_path = os.path.join(
    os.path.expanduser("~"), ".keras", "datasets", "adult.test"
)

The above snippet doesn't account for the directory that keras.utils.get_file creates when it extracts census+income+kdd.zip, so both train_data_path and test_data_path point to the wrong location. A fix has been added.
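One way to build the paths with the extraction directory included is sketched below. The directory name and the two file names are assumptions made for illustration; the real names depend on the Keras version and on the archive contents, so check them against what is actually on disk.

```python
import os

# Assumption: name of the directory the archive is unpacked into.
EXTRACT_DIR_NAME = "census+income+kdd.zip"
# Assumption: names of the train and test files inside the archive.
TRAIN_FILE = "census-income.data"
TEST_FILE = "census-income.test"


def dataset_paths(base_dir=None):
    """Return (train_path, test_path) located inside the extraction directory."""
    if base_dir is None:
        base_dir = os.path.join(os.path.expanduser("~"), ".keras", "datasets")
    extract_dir = os.path.join(base_dir, EXTRACT_DIR_NAME)
    return (
        os.path.join(extract_dir, TRAIN_FILE),
        os.path.join(extract_dir, TEST_FILE),
    )
```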

Additional training errors

Beyond the dataset preparation issues above, the script also hits an error during model training. The error and an attempted solution are detailed below:

2024-12-19 21:02:15.350619: W tensorflow/core/framework/op_kernel.cc:1816] OP_REQUIRES failed at cast_op.cc:122 : UNIMPLEMENTED: Cast string to float is not supported
2024-12-19 21:02:15.350683: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: UNIMPLEMENTED: Cast string to float is not supported
Traceback (most recent call last):
  File "/home/humbulani/tensorflow-env/keras-io-master/examples/structured_data/classification_with_grn_and_vsn.py", line 513, in <module>
    model.fit(
  File "/home/humbulani/tensorflow-env/env/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/humbulani/tensorflow-env/env/lib/python3.11/site-packages/tensorflow/python/framework/ops.py", line 5983, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.UnimplementedError: Exception encountered when calling Functional.call().

{{function_node __wrapped__Cast_device_/job:localhost/replica:0/task:0/device:CPU:0}} Cast string to float is not supported [Op:Cast] name:

Attempted solution:

I believe I have traced the error precisely to the following; here is a pdb session:

> /home/humbulani/tensorflow-env/env/lib/python3.11/site-packages/keras/src/models/functional.py(245)_convert_inputs_to_tensors()
-> converted = []
(Pdb) p self._inputs
[<KerasTensor shape=(None,), dtype=float32, sparse=False, name=age>, <KerasTensor shape=(None,), dtype=float32, sparse=False, name=capital_gains>, <KerasTensor shape=(None,), dtype=float32, sparse=False, name=capital_losses>, ...]

(Pdb) p flat_inputs
[<tf.Tensor: shape=(265,), dtype=float32, numpy=
array([63., 52.,  2., 45.,  0., 43., 67., 26., 29., 53., 31., 59., 57.,...>, <tf.Tensor: shape=(265,), dtype=string, numpy=
array([b' Not in universe', b' Private', b' Not in universe', b' Private',
       b' Not in universe', b' Private', b' Not in universe',...>...]

The function _convert_inputs_to_tensors creates a zip iterator that pairs flat_inputs with self._inputs position by position. As the pdb output above shows, the first element (age) has dtype float32 in both lists. The second element, however, is capital_gains with dtype float32 in self._inputs but a tensor with dtype string in flat_inputs. This mismatch is what triggers the cast error.
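The positional pairing can be reproduced in plain Python. The sketch below is only an illustration: it does not call Keras, and it uses just the names and dtypes of the first two entries printed in the pdb output. The helper name find_dtype_mismatches is hypothetical.

```python
def find_dtype_mismatches(expected_inputs, flat_dtypes):
    """Pair inputs by position, as zip does, and report dtype disagreements."""
    mismatches = []
    for (name, expected_dtype), actual_dtype in zip(expected_inputs, flat_dtypes):
        if expected_dtype != actual_dtype:
            mismatches.append((name, expected_dtype, actual_dtype))
    return mismatches


# Names and dtypes of the first two model inputs, as printed by pdb.
expected_inputs = [("age", "float32"), ("capital_gains", "float32")]
# Dtypes of the first two tensors in flat_inputs, as printed by pdb.
flat_dtypes = ["float32", "string"]

print(find_dtype_mismatches(expected_inputs, flat_dtypes))
# prints [('capital_gains', 'float32', 'string')]
```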

The main issue is that the inputs passed to Functional.call are an OrderedDict, and in the function _standardize_inputs the line flat_inputs = tree.flatten(inputs) does not actually order/sort the OrderedDict as the documentation for tree.flatten describes. This contributes to the mismatch between self._inputs (the model's inputs) and flat_inputs. Hence a fix has been provided in the script's process function to convert the features to a dict.
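The idea behind the fix can be sketched without Keras. The snippet below is an assumption-laden illustration, not the change made in the PR: features_to_dict and values_in_input_order are hypothetical helpers, class_of_worker is an assumed feature name, and the capital_gains value is made up. It shows converting an OrderedDict to a plain dict, and looking values up by input name so that the result no longer depends on the order of the keys.

```python
from collections import OrderedDict


def features_to_dict(features):
    """Convert an OrderedDict of features into a plain dict."""
    return dict(features)


def values_in_input_order(features, input_names):
    """Return feature values ordered by the model's input names, not by key order."""
    return [features[name] for name in input_names]


# Key order here differs from the order of the model inputs.
features = OrderedDict(
    [("age", 63.0), ("class_of_worker", " Not in universe"), ("capital_gains", 0.0)]
)
input_names = ["age", "capital_gains", "class_of_worker"]

plain = features_to_dict(features)
print(values_in_input_order(plain, input_names))
# prints [63.0, 0.0, ' Not in universe']
```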

A fix is provided, and I think the behavior of tree.flatten should be assessed and rectified.

Environment

Tensorflow == 2.16.2
Keras == 3.7.0
Python == 3.11.10

Humbulani1234 · 2025-01-03T16:27:14Z

Find generated .ipynb and .md files.

Humbulani1234 · 2025-01-08T11:20:48Z

Will resend

fixing various errors on the file classification_with_grn_and_vsn

b2de509

github-actions bot assigned sachinprasadhs Dec 22, 2024

Humbulani1234 mentioned this pull request Dec 22, 2024

Fixing various errors in the structured_data files #2013

Merged

generating .ipynb and .md files for the classification_with_grn_and_v…

d34c37f

…sn.py

Humbulani1234 closed this Jan 8, 2025
