Streamline sample Colab notebook & avoid (suppress?) errors #66

Closed
mdingemanse opened this issue Sep 4, 2024 · 1 comment · Fixed by #70
mdingemanse commented Sep 4, 2024

Working through this Colab notebook, I noticed that its output is not entirely self-explanatory yet, and that it generates some errors that may throw off beginners:

  1. The final CSV also includes gaze; it should contain only speech (this cell should have line 5 uncommented).
  2. Parsing the EAF file generates a warning (it works, but looks alarming; can this be suppressed or avoided?):
    /usr/local/lib/python3.10/dist-packages/pympi/Elan.py:1471: UserWarning: Parsing unknown version of ELAN spec... This could result in errors...
      warnings.warn('Parsing unknown version of ELAN spec... '
  3. Saving the corpus locally (this cell) throws an error:
---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-12-3af648ec5aa5> in <cell line: 2>()
      1 # Save the corpus as a .csv file locally
----> 2 Dutch_corpus.write_csv(path = "Dutch_corpus.csv")

8 frames

/usr/local/lib/python3.10/dist-packages/sktalk/corpus/write/writer.py in <lambda>(x)
     52         norm = pd.json_normalize(data=metadata, sep="_")
     53         df = pd.DataFrame(norm)
---> 54         df[:] = np.vectorize(lambda x: ', '.join(
     55             x) if isinstance(x, list) else x)(df)
     56         return df

TypeError: sequence item 0: expected str instance, dict found
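
For readers following along, the failure in the traceback can be reproduced without sktalk: `', '.join(x)` only accepts strings, so a metadata list whose first item is a dict raises exactly this TypeError. The sketch below is an illustration only, not the fix from the linked pull request; `flatten_cell` is a hypothetical helper name, and serialising non-string items with `json.dumps` is just one assumed way of making such a cell writable to CSV.

```python
import json


def flatten_cell(value):
    """Join list values into one string, tolerating non-string items.

    Hypothetical helper for illustration; not part of sktalk.
    """
    if isinstance(value, list):
        return ", ".join(
            # Strings pass through; dicts and other items are serialised to JSON
            item if isinstance(item, str) else json.dumps(item, sort_keys=True)
            for item in value
        )
    # Non-list values are returned unchanged, as in the original lambda
    return value


# Reproduce the failure mode reported in the traceback
try:
    ", ".join([{"name": "A"}])
except TypeError as err:
    message = str(err)

print(message)
print(flatten_cell([{"name": "A"}, "b"]))
```

A helper like this could be applied to each cell in place of the lambda on line 54 of writer.py, but whether that matches the intended CSV layout for nested metadata is a design decision for the maintainers.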

mdingemanse changed the title from "Streamline sample Colba notebook & avoid (suppress?) errors" to "Streamline sample Colab notebook & avoid (suppress?) errors" on Sep 4, 2024
liesenf linked a pull request Sep 4, 2024 that will close this issue

liesenf commented Sep 4, 2024

  1. final CSV also includes gaze, should have only speech
  • Changed the default so that only speech tiers are now selected.
  2. parsing EAF generates an error (it works, but looks alarming; can this be suppressed or avoided?)
  • The warning message originates from the dependency pympi. I will have to check whether it can be suppressed there. Since it is only a warning, I will address item 3 first.
  3. saving corpus locally throws error
  • The function write_csv raises a TypeError when a metadata field contains a list of non-string items (dicts, judging by the traceback). A proposed solution is ready for review in the linked pull request.
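
Until the warning is handled upstream, one option is to filter it on the caller's side with the standard `warnings` module. This is a minimal sketch under assumptions: `parse_quietly` is a hypothetical wrapper name, and the filter matches the message text shown in the report above, so it would stop working if pympi changes that wording.

```python
import warnings


def parse_quietly(parse, *args, **kwargs):
    """Call `parse` while hiding only the ELAN spec version UserWarning.

    Hypothetical wrapper for illustration; `parse` is any callable that
    may trigger the warning, such as a function that reads an EAF file.
    """
    with warnings.catch_warnings():
        # `message` is a regex matched against the start of the warning text
        warnings.filterwarnings(
            "ignore",
            message="Parsing unknown version of ELAN spec",
            category=UserWarning,
        )
        return parse(*args, **kwargs)
```

Because the filter is scoped to the `with` block and to one message, other warnings from the same call still reach the user, which seems safer for a beginner-facing notebook than silencing all warnings.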
