-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor skeleton.to_json
to serialize skeleton object without jsonpickle
#1934
Conversation
WalkthroughThe changes involve modifying the Changes
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- sleap/skeleton.py (2 hunks)
Files skipped from review as they are similar to previous changes (1)
- sleap/skeleton.py
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #1934 +/- ##
===========================================
+ Coverage 73.30% 75.33% +2.02%
===========================================
Files 134 133 -1
Lines 24087 24476 +389
===========================================
+ Hits 17658 18438 +780
+ Misses 6429 6038 -391 ☔ View full report in Codecov by Sentry. |
Relevant sleap-io portion: ^ That is 100% self-contained and can probably be ported over directly. A nicer (if more onerous) approach would be to finally include sleap-io as a dependency and just deep import that exact function and nothing else (for this PR). For example, in SLEAP's def to_json(self, ...):
from sleap_io.io.slp import serialize_skeletons
skeleton_dicts = serialize_skeletons([self])
... This should probably happen at a higher level though and just refactor the serialization logic. Also note a couple of gotchas with the sleap-io implementation: |
…nd use `json.dumps` instead of `jsonpickle`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- sleap/skeleton.py (2 hunks)
Additional context used
Ruff
sleap/skeleton.py
1009-1009: Local variable
indexed_node_graph
is assigned to but never usedRemove assignment to unused variable
indexed_node_graph
(F841)
Additional comments not posted (4)
sleap/skeleton.py (4)
Line range hint
990-1000
: LGTM!The docstring update for the new
node_to_idx
parameter looks good.
1011-1075
: Approved the refactored serialization logic.The code changes for constructing node and edge dictionaries to serialize the skeleton look good. The code is well-documented and aligns with the PR objective of refactoring the
to_json
method to leverage the serialization logic fromsleap-io
. The resulting JSON representation is more structured and eliminates the reliance onjsonpickle
.
1077-1119
: Approved the final skeleton dictionary construction.The code changes for creating the final skeleton dictionary using the node and edge dictionaries look good. The resulting JSON representation is structured and eliminates the reliance on
jsonpickle
. The previous serialization logic has been appropriately commented out.
1095-1100
: Remove the unused variableindexed_node_graph
.The Ruff static analysis tool has identified that the variable
indexed_node_graph
is assigned but never used. Since the code block is commented out and the variable is not used elsewhere, it can be safely removed.Apply this diff to remove the unused variable:
- # if node_to_idx is not None: - # indexed_node_graph = nx.relabel_nodes( - # G=self._graph, mapping=node_to_idx - # ) # map nodes to int - # else: - # indexed_node_graph = self._graphLikely invalid or redundant comment.
sleap/skeleton.py
Outdated
# TODO: Replace jsonpickle with a custom encoder from sleap-io. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Acknowledged the TODO for replacing jsonpickle
.
The TODO comment is a good reminder for a future enhancement.
Let me know if you need any help with implementing the custom encoder from sleap-io
to replace jsonpickle
. I can assist with the implementation or open a GitHub issue to track this task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range, codebase verification and nitpick comments (1)
sleap/skeleton.py (1)
1075-1104
: Consider removing the oldto_json
implementation in the future.It's a good practice to keep the old implementation as a reference during the transition to the new implementation. However, once the new implementation is stable and well-tested, consider removing the commented-out code to improve code readability and maintainability.
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- sleap/skeleton.py (2 hunks)
Additional comments not posted (4)
sleap/skeleton.py (4)
Line range hint
1-1
:
Line range hint
990-998
: LGTM!The docstring accurately describes the
node_to_idx
parameter.
1002-1071
: Excellent work on the newto_json
implementation!The new implementation follows the logic from
sleap-io
and constructs a more structured JSON representation of the skeleton. It correctly handles both body edges and symmetry edges by creating separate node and edge dictionaries.The code changes are approved.
1075-1104
: [consistent_with_summary]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- sleap/skeleton.py (2 hunks)
Additional comments not posted (1)
sleap/skeleton.py (1)
Line range hint
990-1076
: Review of theto_json
method changesThe refactoring of the
to_json
method introduces a manual approach to JSON serialization, which is a significant change from the previousjsonpickle
-based method. Here are some observations and suggestions:
Correctness: The method constructs
nodes_dicts
andedges_dicts
correctly based on the provided nodes and edges. However, the use ofPerformance: The method iterates over all nodes and edges, which is expected. However, ensure that the
node_to_idx
mapping and the handling of edge types are optimized for large datasets.Maintainability: The method uses hardcoded values for
edge_type
(lines 1021-1024, 1045-1047), which could be abstracted or managed in a more flexible way to accommodate changes in edge types.Suggestion: Consider adding error handling for cases where
node_to_idx
might not contain all nodes present in the skeleton, which could lead toKeyError
.Documentation: Update the method's docstring to reflect the new logic and parameters used in the method.
Overall, the changes align with the goal of removing
jsonpickle
dependency and manually handling JSON serialization. Ensure comprehensive testing to validate the serialization format and compatibility with existing systems.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Outside diff range, codebase verification and nitpick comments (7)
sleap/skeleton.py (7)
1002-1020
: Remove debug print statements.The debug print statements at lines 1007, 1017, and 1019 should be removed before merging the code.
Apply this diff to remove the debug print statements:
- print(f'node: {node}') # Note: This ID is not the same as the node index in the skeleton in # legacy SLEAP, but we do not retain this information in the labels, so # IDs will be different. # # The weight is also kept fixed here, but technically this is not # modified or used in legacy SLEAP either. # # TODO: Store legacy metadata in labels to get byte-level compatibility? node_to_id[node] = len(node_to_id) - print(f'node_to_id: {node_to_id}') nodes_dicts.append({"name": node.name, "weight": 1.0}) - print(f'nodes_dicts: {nodes_dicts}')
1021-1059
: Remove debug print statements.The debug print statements at lines 1024-1026, 1033, 1036, 1042-1043, 1045-1046, and 1058 should be removed before merging the code.
Apply this diff to remove the debug print statements:
for edge_ind, edge in enumerate(self.edges): - print(f'edge_ind: {edge_ind}') - print(f'edge: {edge}') if edge_ind == 0: edge_type = { "py/reduce": [ {"py/type": "sleap.skeleton.EdgeType"}, {"py/tuple": [1]}, # 1 = real edge, 2 = symmetry edge ] } - print(f'edge_type: {edge_type}') else: edge_type = {"py/id": 1} - print(f'edge_type: {edge_type}') # Edges are stored as a list of tuples of nodes # The source and target are the nodes in the tuple (edge) are the first and # second nodes respectively source = edge[0] - print(f'source: {source}') - print(f'node_to_id[source]: {node_to_id[source]}') target = edge[1] - print(f'target: {target}') - print(f'node_to_id[target]: {node_to_id[target]}') edges_dicts.append( { # Note: Insert idx is not the same as the edge index in the skeleton # in legacy SLEAP. "edge_insert_idx": edge_ind, "key": 0, # Always 0. "source": {"py/id": node_to_id[source]}, "target": {"py/id": node_to_id[target]}, "type": edge_type, } ) - print(f'edges_dicts: {edges_dicts}')
1060-1084
: Remove debug print statements.The debug print statements at lines 1062-1063, 1075-1076 should be removed before merging the code.
Apply this diff to remove the debug print statements:
for symmetry_ind, symmetry in enumerate(self.symmetries): - print(f'symmetry_ind: {symmetry_ind}') - print(f'symmetry: {symmetry}') if symmetry_ind == 0: edge_type = { "py/reduce": [ {"py/type": "sleap.skeleton.EdgeType"}, {"py/tuple": [2]}, # 1 = real edge, 2 = symmetry edge ] } else: edge_type = {"py/id": 2} src, dst = tuple(symmetry.nodes) - print(f'src: {src}') - print(f'dst: {dst}') edges_dicts.append( { "key": 0, "source": {"py/id": node_to_id[src]}, "target": {"py/id": node_to_id[dst]}, "type": edge_type, } )
1113-1120
: Remove commented-out code.The commented-out code related to encoding the skeleton using
jsonpickle
is no longer needed and should be removed to improve code readability and maintainability.Apply this diff to remove the commented-out code:
- # jsonpickle.set_encoder_options("simplejson", sort_keys=True, indent=4) - # if node_to_idx is not None: - # indexed_node_graph = nx.relabel_nodes( - # G=self._graph, mapping=node_to_idx - # ) # map nodes to int - # else: - # indexed_node_graph = self._graph
1121-1123
: Remove commented-out code.The commented-out code related to encoding the graph using
json_graph
is no longer needed and should be removed to improve code readability and maintainability.Apply this diff to remove the commented-out code:
- # # Encode to JSON - # graph = json_graph.node_link_data(indexed_node_graph)
1124-1136
: Remove commented-out code.The commented-out code related to creating a data dictionary based on the skeleton's template status is no longer needed and should be removed to improve code readability and maintainability.
Apply this diff to remove the commented-out code:
- # # SLEAP v1.3.0 added `description` and `preview_image` to `Skeleton`, but saving - # # these fields breaks data format compatibility. Currently, these are only - # # added in our custom template skeletons. To ensure backwards data format - # # compatibilty of user data, we only save these fields if they are not None. - # if self.is_template: - # data = { - # "nx_graph": graph, - # "description": self.description, - # "preview_image": self.preview_image, - # } - # else: - # data = graph
1137-1139
: Remove commented-out code.The commented-out code related to encoding the data using
jsonpickle
and returning the JSON string is no longer needed and should be removed to improve code readability and maintainability.Apply this diff to remove the commented-out code:
- # json_str = jsonpickle.encode(data) - - # return json_str
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- sleap/skeleton.py (3 hunks)
Additional context used
Ruff
sleap/skeleton.py
1102-1102: Local variable
skeleton_dict
is assigned to but never usedRemove assignment to unused variable
skeleton_dict
(F841)
sleap/skeleton.py
Outdated
# Create skeleton dict. | ||
if self.is_template: | ||
data = { | ||
"nx_graph": graph, | ||
skeleton_dict = { | ||
"directed": True, | ||
"graph": { | ||
"name": self.name, | ||
"num_edges_inserted": len(self.edges), | ||
}, | ||
"links": edges_dicts, | ||
"multigraph": True, | ||
# In the order in Skeleton.nodes and must match up with nodes_dicts. | ||
"nodes": [{"id": {"py/id": node_to_id[node]}} for node in self.nodes], | ||
"description": self.description, | ||
"preview_image": self.preview_image, | ||
} | ||
else: | ||
data = graph | ||
|
||
json_str = jsonpickle.encode(data) | ||
|
||
return json_str | ||
skeleton_dict ={ | ||
"directed": True, | ||
"graph": { | ||
"name": self.name, | ||
"num_edges_inserted": len(self.edges), | ||
}, | ||
"links": edges_dicts, | ||
"multigraph": True, | ||
# In the order in Skeleton.nodes and must match up with nodes_dicts. | ||
"nodes": [{"id": {"py/id": node_to_id[node]}} for node in self.nodes],} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Address the unused variable skeleton_dict
.
The static analysis tool Ruff has correctly identified that the skeleton_dict
variable is assigned but never used. This is a valid concern that should be addressed.
Consider the following options to address the unused variable:
-
If the
skeleton_dict
is intended to be used later in the code, ensure that it is actually used. If not, remove the assignment. -
If the
skeleton_dict
is not needed, remove the entire code block that creates it.
Please let me know if you need any assistance with addressing this issue.
Tools
Ruff
1102-1102: Local variable
skeleton_dict
is assigned to but never usedRemove assignment to unused variable
skeleton_dict
(F841)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Outside diff range and nitpick comments (1)
sleap/skeleton.py (1)
1116-1143
: Remove the commented out code.The commented out code at line range 1116-1143 is the old implementation that used
jsonpickle
for encoding. It has been replaced by the new implementation that manually serializes the skeleton data to JSON.Consider removing the commented out code to improve code readability and maintainability.
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- sleap/skeleton.py (3 hunks)
Additional comments not posted (1)
sleap/skeleton.py (1)
1090-1110
: LGTM!The code changes for creating the
skeleton_dict
structure based onself.is_template
are approved.
sleap/skeleton.py
Outdated
edges_dicts.append( | ||
{ | ||
# Note: Insert idx is not the same as the edge index in the skeleton | ||
# in legacy SLEAP. | ||
"edge_insert_idx": edge_ind, | ||
"key": 0, # Always 0. | ||
"source": {"py/object": "sleap.skeleton.Node", "py/state": {"name": source.name, "weight": 1.0}}, | ||
"target": {"py/object": "sleap.skeleton.Node", "py/state": {"name": target.name, "weight": 1.0}}, | ||
# "target": {"py/id": node_to_id[target]}, | ||
"type": edge_type, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about doing something like:
edges_dicts.append( | |
{ | |
# Note: Insert idx is not the same as the edge index in the skeleton | |
# in legacy SLEAP. | |
"edge_insert_idx": edge_ind, | |
"key": 0, # Always 0. | |
"source": {"py/object": "sleap.skeleton.Node", "py/state": {"name": source.name, "weight": 1.0}}, | |
"target": {"py/object": "sleap.skeleton.Node", "py/state": {"name": target.name, "weight": 1.0}}, | |
# "target": {"py/id": node_to_id[target]}, | |
"type": edge_type, | |
} | |
if source not in node_to_id: | |
source_dict = {"py/object": "sleap.skeleton.Node", "py/state": {"py/tuple": [source.name, source.weight]}} | |
node_to_id[source] = len(node_to_id) | |
else: | |
source_dict = {"py/id": node_to_id[source]} | |
if target not in node_to_id: | |
target_dict = {"py/object": "sleap.skeleton.Node", "py/state": {"py/tuple": [target.name, target.weight]}} | |
node_to_id[target] = len(node_to_id) | |
else: | |
target_dict = {"py/id": node_to_id[target]} | |
edges_dicts.append( | |
{ | |
# Note: Insert idx is not the same as the edge index in the skeleton | |
# in legacy SLEAP. | |
"edge_insert_idx": edge_ind, | |
"key": 0, # Always 0. | |
"source": source_dict, | |
"target": target_dict, | |
"type": edge_type, | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try moving the "nodes"
list above the "links"
dictionary and see what happens
sleap/skeleton.py
Outdated
"links": edges_dicts, | ||
"multigraph": True, | ||
# In the order in Skeleton.nodes and must match up with nodes_dicts. | ||
"nodes": [{"id": {"py/id": node_to_id[node]}} for node in self.nodes], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming jsonpickle is decoding by reading things in order, we expect that all Node
s will only appear in the "nodes"
list once. Therefore, we can (hopefully) move the "nodes"
list before the "links"
dictionary. The "nodes"
list would need to define the nodes in the {py/object: ..., py/state: ...}
dictionary. Then, the "links"
dictionary should just reference nodes by "py/id"
(referencing the Node
's index in the "nodes"
list).
"links": edges_dicts, | |
"multigraph": True, | |
# In the order in Skeleton.nodes and must match up with nodes_dicts. | |
"nodes": [{"id": {"py/id": node_to_id[node]}} for node in self.nodes], | |
"nodes": [{"id": {"py/object": "sleap.skeleton.Node", "py/state": {"name": node.name, "weight": node.weight}}} for node in self.nodes], | |
"links": edges_dicts, | |
"multigraph": True, |
sleap/skeleton.py
Outdated
"links": edges_dicts, | ||
"multigraph": True, | ||
# In the order in Skeleton.nodes and must match up with nodes_dicts. | ||
"nodes": [{"id": {"py/id": node_to_id[node]}} for node in self.nodes],} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming jsonpickle is decoding by reading things in order, we can assume that all Node
s will only appear in the "nodes"
list once. Therefore, we can (hopefully) move the "nodes"
list before the "links"
dictionary. The "nodes"
list would need to define the nodes in the {py/object: ..., py/state: ...}
dictionary. Then, the "links"
dictionary should just reference nodes by "py/id"
(referencing the Node
's index in the "nodes"
list).
"links": edges_dicts, | |
"multigraph": True, | |
# In the order in Skeleton.nodes and must match up with nodes_dicts. | |
"nodes": [{"id": {"py/id": node_to_id[node]}} for node in self.nodes],} | |
"nodes": [{"id": {"py/object": "sleap.skeleton.Node", "py/state": {"name": node.name, "weight": node.weight}}} for node in self.nodes], | |
"links": edges_dicts, | |
"multigraph": True, |
sleap/skeleton.py
Outdated
"source": {"py/object": "sleap.skeleton.Node", "py/state": {"name": source.name, "weight": 1.0}}, | ||
"target": {"py/object": "sleap.skeleton.Node", "py/state": {"name": target.name, "weight": 1.0}}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are moving the "nodes"
list above the "links"
dictionary, then we should be able to just reference the Node
's "py/id"
here (i.e. their index in that "nodes"
list which is also the Node
's index in Skeleton.nodes
)
"source": {"py/object": "sleap.skeleton.Node", "py/state": {"name": source.name, "weight": 1.0}}, | |
"target": {"py/object": "sleap.skeleton.Node", "py/state": {"name": target.name, "weight": 1.0}}, | |
"source": {"py/id": self.nodes.index(node)}, | |
"target": {"py/id": self.nodes.index(node)}, |
From group discussion:
Notes on "tuple" vs "dict" formatWhen pickling an Current versions of Our plan is to manually handle structuring/unstructuring rather than relying on For reference, for a node = sleap.skeleton.Node(name="a", weight=1.0) The old format ("tuple") unstructures this into dictionaries like: {"py/object": "sleap.skeleton.Node", "py/state": {"py/tuple": ["a", 1.0]}} Whereas the new format ("dict") unstructures this into dictionaries like: {"py/object": "sleap.skeleton.Node", "py/state": {"name": "a", "weight": 1.0}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- sleap/skeleton.py (2 hunks)
Additional comments not posted (4)
sleap/skeleton.py (4)
Line range hint
990-1169
: The refactoredto_json
method aligns with the PR objective.The changes made to the
to_json
method eliminate the reliance onjsonpickle
and provide a more controlled and structured approach to serializing the skeleton object into a JSON string. The manual construction of node and edge dictionaries allows for better customization of the resulting JSON representation.Suggestions:
- Consider removing or commenting out the unused code from the previous implementation (lines 1142-1168) to improve code cleanliness and readability.
1021-1065
: LGTM! The edge dictionary construction looks good.The code segment correctly constructs the list of edge dictionaries (
edges_dicts
) based on the skeleton's edges. The inclusion of relevant information such as the edge index, source and target nodes, and edge type ensures a comprehensive representation of the edges in the JSON format.The comments provided are helpful in explaining the differences in edge indexing compared to legacy SLEAP, which adds clarity to the implementation.
1067-1092
: The symmetry edge dictionary construction is implemented correctly.The code segment properly constructs the edge dictionaries for the symmetry edges in the skeleton. The inclusion of the source and target nodes using their IDs from the
node_to_id
mapping ensures consistent referencing of nodes in the JSON representation.The determination of the edge type based on the index of the symmetry edge is a valid approach.
1093-1139
: The construction of the final skeleton dictionary and JSON conversion is implemented properly.The code segment correctly constructs the final skeleton dictionary (
skeleton_dict
) based on whether the skeleton is a template or not. The inclusion of additional fields such asdescription
andpreview_image
for template skeletons ensures that the necessary information is preserved in the JSON representation.The incorporation of the graph information, nodes, and edges in the skeleton dictionary provides a complete representation of the skeleton.
The use of
json.dumps()
to convert the skeleton dictionary to a JSON string follows standard practices and is an appropriate choice.
# TODO: Store legacy metadata in labels to get byte-level compatibility? | ||
node_to_id[node] = len(node_to_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Acknowledged the TODO for storing legacy metadata.
The TODO comment is a good reminder for a future enhancement to improve compatibility with legacy data.
Let me know if you need any help with implementing the storage of legacy metadata in labels to achieve byte-level compatibility. I can assist with the implementation or open a GitHub issue to track this task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range and nitpick comments (2)
sleap/skeleton.py (2)
1068-1092
: Add a comment to clarify theedge_type
handling.The code for constructing the
edges_dicts
list for symmetry edges looks good. However, consider adding a comment to explain why theedge_type
is set differently for the first symmetry edge (lines 1072-1077) compared to the rest (lines 1078-1079). This will improve code readability and maintainability.
1142-1169
: Remove the commented out code.The commented out code that uses
jsonpickle
for serialization is no longer used and can be safely removed. This will improve code cleanliness and readability.Apply this diff to remove the unused code:
- # jsonpickle.set_encoder_options("simplejson", sort_keys=True, indent=4) - # if node_to_idx is not None: - # indexed_node_graph = nx.relabel_nodes( - # G=self._graph, mapping=node_to_idx - # ) # map nodes to int - # else: - # indexed_node_graph = self._graph - - # # Encode to JSON - # graph = json_graph.node_link_data(indexed_node_graph) - - # # SLEAP v1.3.0 added `description` and `preview_image` to `Skeleton`, but saving - # # these fields breaks data format compatibility. Currently, these are only - # # added in our custom template skeletons. To ensure backwards data format - # # compatibilty of user data, we only save these fields if they are not None. - # if self.is_template: - # data = { - # "nx_graph": graph, - # "description": self.description, - # "preview_image": self.preview_image, - # } - # else: - # data = graph - - # json_str = jsonpickle.encode(data) - - # return json_str -
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- sleap/skeleton.py (2 hunks)
Additional comments not posted (5)
sleap/skeleton.py (5)
1002-1019
: LGTM!The logic for building the
nodes_dicts
list looks good. Thenode_to_id
mapping ensures unique IDs are assigned to nodes.
1021-1065
: Looks good!The code for constructing the
edges_dicts
list for normal edges is implemented correctly. The comments help explain the purpose of theedge_insert_idx
andkey
fields.
1093-1135
: LGTM!The code for creating the
skeleton_dict
is implemented correctly. It properly combines thenodes_dicts
andedges_dicts
lists and handles the case when the skeleton is a template by including thedescription
andpreview_image
fields.
1137-1139
: Looks good!The serialization of the
skeleton_dict
to a JSON string usingjson.dumps()
is implemented correctly. Theindent
andsort_keys
parameters ensure a nicely formatted output.
Line range hint
990-1169
: Significant refactoring with improved serialization logic.The
to_json
function has undergone a major refactoring to replace the usage ofjsonpickle
with a custom serialization approach. The new implementation provides more control over the JSON structure and handles various scenarios correctly.The custom serialization logic builds the
nodes_dicts
andedges_dicts
lists to represent the skeleton nodes and edges respectively. The resultingskeleton_dict
is then serialized to a JSON string usingjson.dumps()
.The commented out code that uses
jsonpickle
is no longer needed and can be safely removed to improve code cleanliness and maintainability.Overall, the refactored code is a significant improvement over the previous implementation and enhances the serialization process of the skeleton object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range and nitpick comments (6)
sleap/skeleton.py (6)
1015-1016
: Acknowledged the TODO for storing legacy metadata.The TODO comment is a good reminder for a future enhancement to improve compatibility with legacy data.
Let me know if you need any help with implementing the storage of legacy metadata in labels to achieve byte-level compatibility. I can assist with the implementation or open a GitHub issue to track this task.
1007-1007
: Remove print statements used for debugging.The print statements at lines 1007, 1017, and 1019 appear to be used for debugging purposes. It's a good practice to remove such statements before merging the code to keep the codebase clean.
Consider removing these print statements:
- print(f"node: {node}")
- print(f"node_to_id: {node_to_id}")
- print(f"nodes_dicts: {nodes_dicts}")
Also applies to: 1017-1017, 1019-1019
1024-1025
: Remove print statements used for debugging.The print statements at lines 1024-1025, 1033, 1036, 1042-1043, 1045-1046, and 1064 appear to be used for debugging purposes. It's a good practice to remove such statements before merging the code to keep the codebase clean.
Consider removing these print statements:
- print(f"edge_ind: {edge_ind}") - print(f"edge: {edge}")- print(f"edge_type: {edge_type}")
- print(f"edge_type: {edge_type}")
- print(f"source: {source}") - print(f"node_to_id[source]: {node_to_id[source]}")- print(f"target: {target}") - print(f"node_to_id[target]: {node_to_id[target]}")- print(f"edges_dicts: {edges_dicts}")
Also applies to: 1033-1033, 1036-1036, 1042-1043, 1045-1046, 1064-1064
1068-1069
: Remove print statements used for debugging.The print statements at lines 1068-1069 and 1081-1082 appear to be used for debugging purposes. It's a good practice to remove such statements before merging the code to keep the codebase clean.
Consider removing these print statements:
- print(f"symmetry_ind: {symmetry_ind}") - print(f"symmetry: {symmetry}")- print(f"src: {symmetry_src}") - print(f"dst: {symmetry_dst}")Also applies to: 1081-1082
1136-1136
: Remove print statements used for debugging.The print statements at lines 1136 and 1138 appear to be used for debugging purposes. It's a good practice to remove such statements before merging the code to keep the codebase clean.
Consider removing these print statements:
- print(f"skeleton_dict: {skeleton_dict}")
- print(f"json_str: {json_str}")
Also applies to: 1138-1138
1141-1167
: Remove commented-out code.The commented-out code from lines 1141 to 1167 appears to be the previous implementation using
jsonpickle
and handling node mapping. Since it is no longer needed, it's a good practice to remove such commented-out code to keep the codebase clean and maintainable.Consider removing the commented-out code:
- # jsonpickle.set_encoder_options("simplejson", sort_keys=True, indent=4) - # if node_to_idx is not None: - # indexed_node_graph = nx.relabel_nodes( - # G=self._graph, mapping=node_to_idx - # ) # map nodes to int - # else: - # indexed_node_graph = self._graph - - # # Encode to JSON - # graph = json_graph.node_link_data(indexed_node_graph) - - # # SLEAP v1.3.0 added `description` and `preview_image` to `Skeleton`, but saving - # # these fields breaks data format compatibility. Currently, these are only - # # added in our custom template skeletons. To ensure backwards data format - # # compatibilty of user data, we only save these fields if they are not None. - # if self.is_template: - # data = { - # "nx_graph": graph, - # "description": self.description, - # "preview_image": self.preview_image, - # } - # else: - # data = graph - - # json_str = jsonpickle.encode(data) - - # return json_str
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- sleap/skeleton.py (2 hunks)
Additional comments not posted (1)
sleap/skeleton.py (1)
1092-1134
: LGTM!The code for building the
skeleton_dict
dictionary looks good. It correctly handles the cases for template and non-template skeletons, includes the necessary graph properties, and populates thenodes
andlinks
fields appropriately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Outside diff range and nitpick comments (2)
test_Skeleton_to_json.v006.ipynb (2)
60-103
: Remove or refactor large blocks of commented-out code for readabilityThe code cell contains extensive commented-out code from the old SLEAP implementation. To enhance readability and maintainability, consider removing this code or moving it to documentation if needed for future reference.
112-120
: Clean up commented-out test code for clarityThis cell includes commented-out code related to testing with a skeleton from a labels file created with
sleap-io
. If this code is no longer necessary, please remove it to keep the notebook focused and easy to read.
"# Save to json using the old to_json method (deprecated)\n", | ||
"test_skeleton_json = test_skeleton.to_json()\n", | ||
"print(f'test_skeleton_json: {test_skeleton_json}')\n", | ||
"\n", | ||
"# Save json string to file\n", | ||
"with open(r'test_skeleton_serialization.json', 'w') as f:\n", | ||
" f.write(test_skeleton_json)\n", | ||
" print(f'Wrote test_skeleton_json to file')" | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace deprecated to_json
method with the new serialization approach
The to_json
method is deprecated, and relying on jsonpickle
contradicts the PR's objective to eliminate its usage. Please update the serialization code to use the new manual serialization method that aligns with the refactored Skeleton
class.
"# Load the json file\n", | ||
"test_skeleton_from_json = Skeleton.from_json(test_skeleton_json)\n", | ||
"print(f'test_skeleton_from_json: {test_skeleton_from_json}')" | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolve the TypeError: unhashable type: 'dict'
during deserialization
Deserializing the skeleton with Skeleton.from_json(test_skeleton_json)
results in a TypeError: unhashable type: 'dict'
. This error likely occurs because the nodes are not properly reconstructed as hashable objects. Update the deserialization logic to align with the new JSON structure and ensure that all nodes and edges are correctly hashed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Outside diff range and nitpick comments (2)
sleap/skeleton.py (2)
1035-1036
: Simplifyedge_type
assignmentThe conditional checks
if edge_ind == 0
andif symmetry_ind == 0
to assignedge_type
differently may not be necessary and can be simplified for clarity.Consider initializing
edge_type
before the loops or handling it uniformly within the loops.Also applies to: 1071-1072
1123-1125
: Ensure consistent formatting in list comprehensionIn building the
nodes
list withinskeleton_dict
, ensure consistent formatting and indentation of the list comprehension for better readability.Review the formatting to align with PEP 8 guidelines for list comprehensions.
node_to_id = {} | ||
for node in self.nodes: | ||
if node not in node_to_id: | ||
print(f"node: {node}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove debug print
statements
The print
statements throughout the to_json
method appear to be for debugging purposes. These should be removed or replaced with appropriate logging to maintain clean production code.
Apply this diff to remove the print
statements:
- print(f"node: {node}")
- print(f"node_to_id: {node_to_id}")
- print(f"nodes_dicts: {nodes_dicts}")
- print(f"edge_ind: {edge_ind}")
- print(f"edge: {edge}")
- print(f"edge_type: {edge_type}")
- print(f"source: {source}")
- print(f"node_to_id[source]: {node_to_id[source]}")
- print(f"target: {target}")
- print(f"node_to_id[target]: {node_to_id[target]}")
- print(f"edges_dicts: {edges_dicts}")
- print(f"symmetry_ind: {symmetry_ind}")
- print(f"symmetry: {symmetry}")
- print(f"src: {symmetry_src}")
- print(f"dst: {symmetry_dst}")
- print(f"skeleton_dict: {skeleton_dict}")
- print(f"json_str: {json_str}")
Also applies to: 1017-1017, 1019-1019, 1024-1026, 1033-1036, 1042-1046, 1058-1058, 1062-1063, 1075-1076, 1130-1132
sleap/skeleton.py
Outdated
"source": {"py/id": self.nodes.index(node)}, | ||
"target": {"py/id": self.nodes.index(node)}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct the use of undefined variable node
In lines 1053-1054, the variable node
is undefined in this context. This will result in a NameError
.
Replace node
with source
and target
respectively:
- "source": {"py/id": self.nodes.index(node)},
- "target": {"py/id": self.nodes.index(node)},
+ "source": {"py/id": self.nodes.index(source)},
+ "target": {"py/id": self.nodes.index(target)},
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
"source": {"py/id": self.nodes.index(node)}, | |
"target": {"py/id": self.nodes.index(node)}, | |
"source": {"py/id": self.nodes.index(source)}, | |
"target": {"py/id": self.nodes.index(target)}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
node_to_idx: optional dict which maps :class:`Node`s to index | ||
in some list. This is used when saving |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unused parameter node_to_idx
from to_json
method
The parameter node_to_idx
is not used within the to_json
method. Consider removing it to simplify the method signature and avoid confusion.
Closed in favor of: |
Description
skeleton.json
wherejsonpickle
is encoding the skeleton graph dictionary in a way that breaks backwards compatibility, specifically:__setstate__
from "legacy" tuple format soon python-attrs/attrs#1091) is to remove support for deserializing tuples after giving everyone a few versions of attrs to convert their data over to a dictionary serialization.""" from Error during Inference. ValueError: Found Multiple nodes named (name). #1918 .sleap/sleap/skeleton.py
Lines 986 to 1028 in 1370782
jsonpickle
and manually serialize our skeleton objects to json strings.sleap-io
to do thissleap/sleap/skeleton.py
Line 1057 in 1370782
Types of changes
Does this address any currently open issues?
Outside contributors checklist
Thank you for contributing to SLEAP!
❤️
Summary by CodeRabbit
Summary by CodeRabbit
to_json
method for the Skeleton class, enhancing the serialization of skeleton data with a more structured representation of nodes and edges.