- updated _metadata_to_df function #70

liesenf · 2024-09-04T09:01:28Z

Bug description:
write_csv raises a TypeError when metadata is provided in dict format (for example, a list containing dicts, as in the traceback below).

Solution:
Edited the function behind write_csv to add a "process_element" function that loops over each element and handles lists, dicts, and strings, or returns the value as is.
Closes #69

Error message:

TypeError Traceback (most recent call last)
in <cell line: 2>()
1 # Save the corpus as a .csv file locally
----> 2 Dutch_corpus.write_csv(path = "Dutch_corpus.csv")

8 frames
/usr/local/lib/python3.10/dist-packages/sktalk/corpus/write/writer.py in (x)
52 norm = pd.json_normalize(data=metadata, sep="_")
53 df = pd.DataFrame(norm)
---> 54 df[:] = np.vectorize(lambda x: ', '.join(
55 x) if isinstance(x, list) else x)(df)
56 return df

TypeError: sequence item 0: expected str instance, dict found
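The failure can be reproduced without the package. This is a minimal sketch, assuming the metadata holds a list whose items are dicts; the variable name and value are hypothetical and not taken from the corpus.

```python
# Hypothetical metadata value: a list whose first item is a dict.
metadata_value = [{"name": "speaker_1"}]

try:
    # str.join only accepts strings, so a dict item raises TypeError.
    ", ".join(metadata_value)
except TypeError as err:
    message = str(err)

print(message)
```

The message matches the one at the end of the traceback above.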

Key Changes:
- Added process_element function. It handles three cases:
  - List: joins the elements with ', '.
  - Dictionary: converts the dictionary to a JSON string using json.dumps. Alternatively, the dictionary could be converted to a custom string format, e.g. by joining key-value pairs with a colon.
  - Other types: returns the value as-is.
- Replaced the lambda with process_element: np.vectorize now applies this more robust function to each element in the DataFrame.

This approach should resolve the TypeError by correctly handling cases where elements in the DataFrame are dictionaries.
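The three cases can be sketched as follows. This is an illustration under stated assumptions, not the merged code: it uses only the standard library and a plain dict instead of np.vectorize over a DataFrame, the sample row is hypothetical, and serialising non-string list items with json.dumps is an assumption added here so that a list of dicts does not fail.

```python
import json


def process_element(x):
    """Return a CSV-friendly value for a single metadata cell."""
    if isinstance(x, list):
        # Assumption: non-string items are serialised before joining,
        # so ', '.join never receives a dict.
        return ", ".join(
            item if isinstance(item, str) else json.dumps(item) for item in x
        )
    if isinstance(x, dict):
        # Dictionaries become JSON strings.
        return json.dumps(x)
    # Anything else is returned unchanged.
    return x


# Hypothetical flattened metadata row, for illustration only.
row = {
    "source": "Dutch_corpus",
    "languages": ["nl", "en"],
    "participants": [{"id": "A"}, {"id": "B"}],
    "extra": {"license": "CC-BY"},
    "n_files": 3,
}

cleaned = {key: process_element(value) for key, value in row.items()}
print(cleaned)
```

Every value in cleaned is then either a string or an unchanged scalar, which is the shape a CSV cell needs.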

sonarqubecloud · 2024-09-04T09:32:03Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
81.8% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

mdingemanse · 2024-09-04T09:40:12Z

looks like a simple enough change — but your branch is out of date with main. Want to merge/rebase?

/edit I have reviewed & approved, feel free to merge this PR

bvreede · 2024-09-04T23:19:09Z

Great quick fix here!

The addition of a playground/ folder with a copied notebook is less standard though — consider removing this...

liesenf · 2024-09-05T07:22:37Z

@bvreede Thanks for spotting - deleted

I was having trouble remembering the setup you introduced back in the day, with a development notebook that loads the package from a branch.

Was that in the private repo scikit-talk_benchmarking?

- updated _metadata_to_df function

46f6183

liesenf added the bug label Sep 4, 2024

This was linked to issues Sep 4, 2024

Fix TypeError in write_csv function #69

Closed

Streamline sample Colab notebook & avoid (suppress?) errors #66

Closed

liesenf added 2 commits September 4, 2024 11:22

refactoring to pass linter

48b341a

fixed indentation

f77d169

mdingemanse approved these changes Sep 4, 2024

mdingemanse merged commit 6fbe8f9 into main Sep 4, 2024
9 checks passed
