Module to write to csv file #47

Morrizzzzz · 2023-11-23T13:15:06Z

Closes #35


bvreede · 2023-12-19T14:17:23Z

@carschno — if you want to try the new functionality (.write_csv()), take a look at the example notebook in the docs/ folder for demo code!

carschno

Added some minor suggestions.

carschno · 2023-12-19T14:23:00Z

sktalk/corpus/conversation.py

+        self,
+        utterances: list["Utterance"],
+        metadata: Optional[dict] = None,
+        suppress_warnings: bool = False  # noqa: F821


Why is the noqa necessary here? I don't see why an F821 (undefined name) would be triggered here.
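One possible explanation, not confirmed anywhere in this thread: pyflakes also checks names inside string annotations, so `list["Utterance"]` can raise F821 if `Utterance` is not imported in the module. Under that assumption, a guarded import would make the `noqa` unnecessary. The sketch below is a stand-in class, and the module path `utterance` is hypothetical.

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Hypothetical module path. This branch is skipped at runtime and is only
    # seen by type checkers and linters, so it cannot cause a circular import.
    from utterance import Utterance


class Conversation:
    """Illustrative stand-in, not the sktalk class."""

    def __init__(
        self,
        utterances: list["Utterance"],
        metadata: Optional[dict] = None,
        suppress_warnings: bool = False,
    ):
        self.utterances = utterances
        self.metadata = metadata or {}
        self.suppress_warnings = suppress_warnings


conversation = Conversation(utterances=[])
```

With the name visible to the linter, the forward reference resolves and no suppression comment is needed on the signature.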

carschno · 2023-12-19T14:28:33Z

sktalk/corpus/conversation.py

+    @property
+    def metadata_df(self):
+        """Return the conversation metadata as a pandas dataframe."""
+        if not hasattr(self, "_metadata_df"):


In such cases, I prefer to set the respective attribute to None in __init__():

self._metadata_df = None

Consequently, this check can be replaced:

Suggested change

-        if not hasattr(self, "_metadata_df"):
+        if self._metadata_df is None:

I find this a bit more explicit.
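A minimal sketch of the pattern described in this comment, using a stand-in class rather than the real `Conversation`. A plain list of dicts takes the place of the pandas dataframe so that the example needs no third-party packages; the names `metadata_rows` and `build_count` are made up for illustration.

```python
from typing import Optional


class Conversation:
    """Illustrative stand-in, not the sktalk class."""

    def __init__(self, metadata: Optional[dict] = None):
        self._metadata = metadata or {}
        # Declared up front so the attribute always exists;
        # None means "not built yet".
        self._metadata_rows: Optional[list] = None
        self.build_count = 0  # only here to make the caching visible

    @property
    def metadata_rows(self) -> list:
        """Build the table on first access, then reuse the cached value."""
        if self._metadata_rows is None:
            self.build_count += 1
            self._metadata_rows = [dict(self._metadata)]
        return self._metadata_rows


conversation = Conversation({"source": "example.cha"})
first = conversation.metadata_rows
second = conversation.metadata_rows  # served from the cache
```

Declaring the attribute in the constructor also keeps every instance attribute visible in one place, which a `hasattr` check does not.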

carschno · 2023-12-20T10:21:52Z

sktalk/corpus/corpus.py

+    @property
+    def metadata_df(self):
+        """Return the corpus metadata as a pandas dataframe."""
+        if not hasattr(self, "_metadata_df"):


As suggested in conversation.py, I would explicitly set self._metadata_df = None in __init__(), and replace this check with:

Suggested change

-        if not hasattr(self, "_metadata_df"):
+        if self._metadata_df is None:

carschno · 2023-12-20T10:22:27Z

sktalk/corpus/corpus.py

+    @property
+    def utterance_df(self):
+        """Return the corpus utterances as a pandas dataframe."""
+        if not hasattr(self, "_utterance_df"):


As above, preferably check for None and set explicitly in the constructor.

carschno · 2023-12-20T10:32:25Z

sktalk/corpus/write/writer.py

        """
        _path = Path(path).with_suffix(".json")

        object_dict = self.asdict()

        with open(_path, "w", encoding='utf-8') as file:
            json.dump(object_dict, file, indent=4)
+        print("Object saved to", _path)
+
+    def write_csv(self, path: str = "./file.csv"):


Very minor detail, but I find it slightly confusing that file.csv is the default, whereas that literal filename will never be used.
Given that two files are required, I would perhaps split the argument into two to make their respective purposes explicit: directory and file prefix (and optionally suffix too). With reasonable defaults, this should not add more load to the caller, e.g.:

Suggested change

-    def write_csv(self, path: str = "./file.csv"):
+    def write_csv(self, directory: str = ".", prefix: str = "file", suffix: str = ".csv"):

In the implementation, you can now construct the different files like this:

(Path(directory) / "_".join((prefix, specifier))).with_suffix(suffix)

with specifier being "metadata" or "utterances".
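The proposal above can be sketched as a small helper. This is only an illustration of the reviewer's suggestion, which was not adopted in the end; the function name `csv_path` is hypothetical.

```python
from pathlib import Path


def csv_path(
    directory: str = ".",
    prefix: str = "file",
    specifier: str = "metadata",
    suffix: str = ".csv",
) -> Path:
    """Build '<directory>/<prefix>_<specifier><suffix>'."""
    # str.join takes a single iterable, so the two parts go in a tuple.
    return (Path(directory) / "_".join((prefix, specifier))).with_suffix(suffix)


metadata_path = csv_path("out", "corpus", "metadata")
utterances_path = csv_path("out", "corpus", "utterances")
```

One caveat: `with_suffix` replaces everything after the last dot in the file name, so a prefix that itself contains a dot would be cut short.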

Thanks for this comment! I'd like to keep the workflows for csv saving and json saving similar, so I'm keeping the argument structure, but I've opted to use the given filename for the utterance db, and to add a _metadata suffix only for the metadata csv file.
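A rough sketch of the naming scheme described in this reply, not the merged implementation. It assumes the path is normalised with `with_suffix(".csv")`, mirroring the `.json` handling in the diff above, and that the metadata file is named by appending `_metadata` to the stem. The helper name `csv_paths` is hypothetical.

```python
from pathlib import Path


def csv_paths(path: str = "./file.csv") -> tuple[Path, Path]:
    """Return (utterance_path, metadata_path) derived from a single path."""
    # The utterance file uses the given name, forced to a .csv extension.
    utterance_path = Path(path).with_suffix(".csv")
    # The metadata file sits next to it, with "_metadata" added to the stem.
    metadata_path = utterance_path.with_name(utterance_path.stem + "_metadata.csv")
    return utterance_path, metadata_path


utterance_path, metadata_path = csv_paths("out/corpus")
```

This keeps a single `path` argument, as in the json writer, so the default `file.csv` is a name that is actually written.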

tests/corpus/writer/test_writer.py

sonarqubecloud · 2023-12-20T14:04:42Z

Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
94.1% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

initial write csv module

bd583a2

Morrizzzzz marked this pull request as draft November 23, 2023 13:15

Morrizzzzz added 4 commits November 23, 2023 15:00

draft not finished csv

82f6047

added correct writer

0898f1b

write three seperate csv files

a207600

added unique id to csv files

fe3e7de

bvreede mentioned this pull request Dec 1, 2023

Generate turn type dynamics #1

Merged

6 tasks

Morrizzzzz and others added 23 commits December 5, 2023 09:29

Merge branch 'main' into csv_writer_version2

042a608

added tests for path csv writer

20449db

added test to check outcome csv file

9ef3dfb

fixed csv writer extra row issue

3e54dec

make linter happy by replacing id

c856995

conversation ID is generated when conversation is initialized

52e7656

add empty conversation test

2b3023b

parametrize test for csv writer

15fb31a

parametrize utterance test

c96e4e3

fix linter issues

5cadbd7

remove participants output

ba6b6d0

only write utterancecsv if necessary

da6acbc

change variable names for readability

7407bf4

do not overwrite metadata when calculating

4cc1dcb

write nested dictionaries to their own files

0ecefe1

merge branch main and fix conflicts

bca9d3d

move tests to separate file

3c56bee

replace conversation id by source

207fd20

write metadata as pandas df

b7fcbb2

remove unused code and add pandas dependency

c70e4ab

write output files with optional flags and allow for dataframe properties

f2e6b81

add source standard to metadata

866515c

reorder tests and clarify intent

ffee749

bvreede added 9 commits December 14, 2023 12:45

linter

18d2662

fix linting issue

f8cf890

fix linting issue

8670703

noqa to deal with linter complaint

eb2cc22

add docstring and comments, remove unused code

dcb8d2a

csv writer to mixin class and add tests for corpus

d17a280

remove unused imports

5636719

add reporting

b557281

update notebook with csv functionality and reorder

8520be6

bvreede marked this pull request as ready for review December 19, 2023 14:14

bvreede requested a review from carschno December 19, 2023 14:16

carschno approved these changes Dec 20, 2023


bvreede added 3 commits December 20, 2023 14:59

apply reviewer comments: explicitly declare dataframe attributes

2ba4281

apply reviewer comments: update output path for utterance db

eef8d1b

update example notebook with filename changes

db3aa53

bvreede merged commit b77821f into main Dec 20, 2023
8 checks passed

bvreede deleted the csv_writer_version2 branch December 20, 2023 14:07

bvreede mentioned this pull request Jan 5, 2024

Produce dataframe from Conversation object #29

Closed
