Generate turn type dynamics #1

Merged on Dec 1, 2023 (55 commits)
Commits
8dbd860
code from notebook
bvreede Mar 31, 2023
477b7e9
merge main
bvreede Apr 7, 2023
214a7b3
add turndynamics code from notebook
bvreede Apr 7, 2023
2d1a6f0
instructions for disabling the bloody githook
bvreede Apr 7, 2023
d2b11b0
move code from test to notebook
bvreede Apr 7, 2023
6289d45
Merge branch 'main' into first-functions
bvreede Nov 2, 2023
da8ef4d
remove notebook from repository
bvreede Nov 2, 2023
1f77d4d
start adding utterance functions
bvreede Nov 3, 2023
f76aaba
add calculated fields to dataclass
bvreede Nov 3, 2023
3736596
add python 3.6 to show that it breaks
bvreede Nov 7, 2023
99a18d2
add python 3.8 to show that it breaks
bvreede Nov 7, 2023
808e3e2
add python 3.6 to show that it breaks
bvreede Nov 7, 2023
be49481
add python 3.8 to show that it breaks
bvreede Nov 7, 2023
5d8862b
remove earlier python versions again
bvreede Nov 7, 2023
8d55ced
add fields to initial data class
bvreede Nov 8, 2023
ea95de3
implement until method calculating time differences between utterances
bvreede Nov 8, 2023
bf1bb60
inmplement subconversation and until next method at conversation object
bvreede Nov 8, 2023
4edab6b
autopep8
bvreede Nov 8, 2023
31a98d3
elaborate subconversation
bvreede Nov 9, 2023
642c941
move time processing to utterance
bvreede Nov 10, 2023
308ab94
object oriented and small fixes
bvreede Nov 10, 2023
d5c62f9
refactor post init
bvreede Nov 10, 2023
12a3d3a
add subconversation functionaliry
bvreede Nov 10, 2023
1fb7323
subconversation can select based on index or time
bvreede Nov 14, 2023
51f1a9e
make linter happy
bvreede Nov 14, 2023
f7db0f6
also pack arguments for second subconversation test
bvreede Nov 14, 2023
47dc536
fix linting issues
bvreede Nov 14, 2023
aaff48a
add dyadic property
bvreede Nov 14, 2023
7ee747f
apply conversation wide calculations dyadic and time to nxt
bvreede Nov 16, 2023
c0593cb
subconversation is internal
bvreede Nov 21, 2023
319ca96
count number of participants
bvreede Nov 21, 2023
b8ca53c
calculate FTO
bvreede Nov 21, 2023
6a34489
remove old code
bvreede Nov 21, 2023
c5c3430
address linter comments
bvreede Nov 21, 2023
7802063
address linter issues and update fto calculation
bvreede Nov 21, 2023
3bbeb00
fix noqa
bvreede Nov 21, 2023
cc91bbc
fix noqa
bvreede Nov 21, 2023
61a1950
calculations update in metadata corrected
bvreede Nov 21, 2023
f79b5ea
allow warning supppression on empty conversations inside subconversation
bvreede Nov 24, 2023
d4c9880
refer to hidden _utterances instead of property
bvreede Nov 24, 2023
20c65db
allow participant counting to exclude None
bvreede Nov 24, 2023
00ad82a
add test for FTO calculation
bvreede Nov 24, 2023
90c4ac6
ensure participant count does not include future utterances
bvreede Nov 24, 2023
69912f6
split subconversation into two functions
bvreede Nov 28, 2023
d62a868
fix linter issue
bvreede Nov 28, 2023
9229e8a
update example notebook
bvreede Nov 28, 2023
574a729
Update sktalk/corpus/conversation.py
bvreede Nov 29, 2023
473e753
Update sktalk/corpus/conversation.py
bvreede Nov 29, 2023
e076b6c
add comments re: error
bvreede Nov 29, 2023
3af95fc
Update sktalk/corpus/conversation.py
bvreede Nov 29, 2023
4c7ac0c
rewrite FTO calculation
bvreede Nov 30, 2023
ada29ce
rename overlap function to make it available
bvreede Nov 30, 2023
13829bb
update FTO calculation to account for partial overlap
bvreede Dec 1, 2023
2f64c94
refactor overlap functions
bvreede Dec 1, 2023
b1e096f
Update sktalk/corpus/parsing/cha.py
bvreede Dec 1, 2023
2 changes: 2 additions & 0 deletions .githooks/pre-commit
@@ -2,6 +2,8 @@

### To enable this githook, run:
### git config --local core.hooksPath .githooks
### to disable:
### git config --unset core.hooksPath

echo "Script $0 triggered ..."

110 changes: 98 additions & 12 deletions docs/notebooks/example.ipynb
@@ -63,7 +63,7 @@
{
"data": {
"text/plain": [
"<sktalk.corpus.conversation.Conversation at 0x10ea2bd60>"
"<sktalk.corpus.conversation.Conversation at 0x116bc4af0>"
]
},
"execution_count": 2,
@@ -92,16 +92,16 @@
{
"data": {
"text/plain": [
"[Utterance(utterance='0', participant='S', time=(0, 1500), begin='00:00:00.000', end='00:00:01.500', metadata=None),\n",
" Utterance(utterance=\"mm I'm glad I saw you⇗\", participant='S', time=(1500, 2775), begin='00:00:01.500', end='00:00:02.775', metadata=None),\n",
" Utterance(utterance=\"I thought I'd lost you (0.3)\", participant='S', time=(2775, 3773), begin='00:00:02.775', end='00:00:03.773', metadata=None),\n",
" Utterance(utterance=\"⌈no I've been here for a whi:le⌉,\", participant='H', time=(4052, 5515), begin='00:00:04.052', end='00:00:05.515', metadata=None),\n",
" Utterance(utterance='⌊xxx⌋ (0.3)', participant='S', time=(4052, 5817), begin='00:00:04.052', end='00:00:05.817', metadata=None),\n",
" Utterance(utterance=\"⌊hm:: (.) if ʔI couldn't boʔrrow, (1.3) the second (0.2) book of readings fo:r\", participant='S', time=(6140, 9487), begin='00:00:06.140', end='00:00:09.487', metadata=None),\n",
" Utterance(utterance='commu:nicating acro-', participant='H', time=(12888, 14050), begin='00:00:12.888', end='00:00:14.050', metadata=None),\n",
" Utterance(utterance='no: for family gender and sexuality', participant='H', time=(14050, 17014), begin='00:00:14.050', end='00:00:17.014', metadata=None),\n",
" Utterance(utterance=\"+≋ ah: that's the second on is itʔ\", participant='S', time=(17014, 18611), begin='00:00:17.014', end='00:00:18.611', metadata=None),\n",
" Utterance(utterance=\"+≋ I think it's s⌈ame family gender⌉ has a second book\", participant='H', time=(18611, 21090), begin='00:00:18.611', end='00:00:21.090', metadata=None)]"
"[Utterance(utterance='0', participant='S', time=[0, 1500], begin='00:00:00.000', end='00:00:01.500', metadata=None, utterance_clean='S x150_1500x15', utterance_list=['S', 'x150_1500x15'], n_words=2, n_characters=13, time_to_next=None, dyadic=None, FTO=None),\n",
" Utterance(utterance=\"mm I'm glad I saw you⇗\", participant='S', time=[1500, 2775], begin='00:00:01.500', end='00:00:02.775', metadata=None, utterance_clean='S mm Im glad I saw you x151500_2775x15', utterance_list=['S', 'mm', 'Im', 'glad', 'I', 'saw', 'you', 'x151500_2775x15'], n_words=8, n_characters=31, time_to_next=None, dyadic=None, FTO=None),\n",
" Utterance(utterance=\"I thought I'd lost you (0.3)\", participant='S', time=[2775, 3773], begin='00:00:02.775', end='00:00:03.773', metadata=None, utterance_clean='S I thought Id lost you x152775_3773x15 x153773_4052x15', utterance_list=['S', 'I', 'thought', 'Id', 'lost', 'you', 'x152775_3773x15', 'x153773_4052x15'], n_words=8, n_characters=48, time_to_next=None, dyadic=None, FTO=None),\n",
" Utterance(utterance=\"⌈no I've been here for a whi:le⌉,\", participant='H', time=[4052, 5515], begin='00:00:04.052', end='00:00:05.515', metadata=None, utterance_clean='H no Ive been here for a while x154052_5515x15', utterance_list=['H', 'no', 'Ive', 'been', 'here', 'for', 'a', 'while', 'x154052_5515x15'], n_words=9, n_characters=38, time_to_next=None, dyadic=None, FTO=None),\n",
" Utterance(utterance='⌊xxx⌋ (0.3)', participant='S', time=[4052, 5817], begin='00:00:04.052', end='00:00:05.817', metadata=None, utterance_clean='S xxx x154052_5817x15 x155817_6140x15', utterance_list=['S', 'xxx', 'x154052_5817x15', 'x155817_6140x15'], n_words=4, n_characters=34, time_to_next=None, dyadic=None, FTO=None),\n",
" Utterance(utterance=\"⌊hm:: (.) if ʔI couldn't boʔrrow, (1.3) the second (0.2) book of readings fo:r\", participant='S', time=[6140, 9487], begin='00:00:06.140', end='00:00:09.487', metadata=None, utterance_clean='S hm if ʔI couldnt boʔrrow x156140_9487x15 the second book of readings for x159487_12888x15', utterance_list=['S', 'hm', 'if', 'ʔI', 'couldnt', 'boʔrrow', 'x156140_9487x15', 'the', 'second', 'book', 'of', 'readings', 'for', 'x159487_12888x15'], n_words=14, n_characters=78, time_to_next=None, dyadic=None, FTO=None),\n",
" Utterance(utterance='commu:nicating acro-', participant='H', time=[12888, 14050], begin='00:00:12.888', end='00:00:14.050', metadata=None, utterance_clean='H communicating acro x1512888_14050x15', utterance_list=['H', 'communicating', 'acro', 'x1512888_14050x15'], n_words=4, n_characters=35, time_to_next=None, dyadic=None, FTO=None),\n",
" Utterance(utterance='no: for family gender and sexuality', participant='H', time=[14050, 17014], begin='00:00:14.050', end='00:00:17.014', metadata=None, utterance_clean='H no for family gender and sexuality x1514050_17014x15', utterance_list=['H', 'no', 'for', 'family', 'gender', 'and', 'sexuality', 'x1514050_17014x15'], n_words=8, n_characters=47, time_to_next=None, dyadic=None, FTO=None),\n",
" Utterance(utterance=\"+≋ ah: that's the second on is itʔ\", participant='S', time=[17014, 18611], begin='00:00:17.014', end='00:00:18.611', metadata=None, utterance_clean='S ah thats the second on is itʔ x1517014_18611x15', utterance_list=['S', 'ah', 'thats', 'the', 'second', 'on', 'is', 'itʔ', 'x1517014_18611x15'], n_words=9, n_characters=41, time_to_next=None, dyadic=None, FTO=None),\n",
" Utterance(utterance=\"+≋ I think it's s⌈ame family gender⌉ has a second book\", participant='H', time=[18611, 21090], begin='00:00:18.611', end='00:00:21.090', metadata=None, utterance_clean='H I think its same family gender has a second book x1518611_21090x15', utterance_list=['H', 'I', 'think', 'its', 'same', 'family', 'gender', 'has', 'a', 'second', 'book', 'x1518611_21090x15'], n_words=12, n_characters=57, time_to_next=None, dyadic=None, FTO=None)]"
]
},
"execution_count": 3,
@@ -225,7 +225,7 @@
{
"data": {
"text/plain": [
"[<sktalk.corpus.conversation.Conversation at 0x10ea2bd60>]"
"[<sktalk.corpus.conversation.Conversation at 0x116bc4af0>]"
]
},
"execution_count": 7,
@@ -256,6 +256,92 @@
"source": [
"GCSAusE.write_json(path = \"CGSAusE.json\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Analyzing turn-taking dynamics\n",
"\n",
"When creating a `Conversation` object, a number of calculations and transformations are performed on the `Utterance` objects within.\n",
"For example, the number of words in each utterance is calculated, and stored under `Utterance.n_words`.\n",
"You can see this for a specific utterance as follows:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cha01.utterances[0].n_words"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"More sophisticated calculations can be performed, but do not happen automatically.\n",
"An example of this is the calculation of the Floor Transfer Offset (FTO) per utterance.\n",
"FTO is the difference, in milliseconds, between the start of a turn and the end of the most relevant prior turn by another participant.\n",
"If the two turns overlap, the FTO is negative.\n",
"If there is a pause between them, the FTO is positive.\n",
"\n",
"We can calculate the FTOs of the utterances in a conversation:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0, 1500] S - FTO: None\n",
"[1500, 2775] S - FTO: None\n",
"[2775, 3773] S - FTO: None\n",
"[4052, 5515] H - FTO: 279\n",
"[4052, 5817] S - FTO: None\n",
"[6140, 9487] S - FTO: 625\n",
"[12888, 14050] H - FTO: 3401\n",
"[14050, 17014] H - FTO: 4563\n",
"[17014, 18611] S - FTO: 0\n",
"[18611, 21090] H - FTO: 0\n"
]
}
],
"source": [
"cha01.calculate_FTO()\n",
"\n",
"for utterance in cha01.utterances[:10]:\n",
" print(f'{utterance.time} {utterance.participant} - FTO: {utterance.FTO}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To determine which prior turn is the relevant turn for FTO calculation, the following criteria are used to find a relevant utterance prior to an utterance U:\n",
"\n",
"- the relevant utterance must be by another participant\n",
"- the relevant utterance must be the most recent utterance by that participant\n",
"- the relevant utterance must have started more than a specified number of ms before the start of U. This time defaults to 200 ms, but can be changed with the `planning_buffer` argument.\n",
"- the relevant utterance must be partly or entirely within the context window. The context window is defined as 10s (or 10000ms) prior to the utterance U. The size of this window can be changed with the `window` argument.\n",
"- within the context window, there must be a maximum of 2 speakers, which can be changed to 3 with the `n_participants` argument."
]
}
],
"metadata": {
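The FTO definition and the selection criteria described in the notebook cells of this diff can be sketched in plain Python. This is a minimal illustration and not the sktalk implementation: the `Turn` dataclass and the `floor_transfer_offset` function are hypothetical names, and only the keyword arguments `window`, `planning_buffer` and `n_participants` are taken from the notebook text. The order in which the criteria are applied (filter candidates first, then take the most recent one) is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Turn:
    """Hypothetical stand-in for an utterance: speaker plus (begin, end) in ms."""
    participant: str
    time: Tuple[int, int]


def floor_transfer_offset(turns: List[Turn], index: int, window: int = 10000,
                          planning_buffer: int = 200,
                          n_participants: int = 2) -> Optional[int]:
    """Return the FTO in ms for turns[index], or None if no prior turn qualifies."""
    current = turns[index]
    begin = current.time[0]
    window_start = begin - window

    # Turns that start no later than the current one and lie partly or
    # entirely within the context window (future turns are excluded).
    in_window = [t for t in turns
                 if t.time[0] <= begin and t.time[1] >= window_start]

    # Too many speakers in the context window: no FTO is assigned.
    if len({t.participant for t in in_window}) > n_participants:
        return None

    # Candidates: turns by another participant that started more than
    # planning_buffer ms before the current turn started.
    candidates = [t for t in in_window
                  if t.participant != current.participant
                  and begin - t.time[0] > planning_buffer]
    if not candidates:
        return None

    # Assumption: the most recent candidate is the relevant prior turn.
    relevant = max(candidates, key=lambda t: t.time[0])

    # Positive for a gap, negative for overlap.
    return begin - relevant.time[1]


# First ten utterances of the notebook example (participant, time in ms).
turns = [
    Turn("S", (0, 1500)), Turn("S", (1500, 2775)), Turn("S", (2775, 3773)),
    Turn("H", (4052, 5515)), Turn("S", (4052, 5817)), Turn("S", (6140, 9487)),
    Turn("H", (12888, 14050)), Turn("H", (14050, 17014)),
    Turn("S", (17014, 18611)), Turn("H", (18611, 21090)),
]
ftos = [floor_transfer_offset(turns, i) for i in range(len(turns))]
print(ftos)
```

On these ten utterances the sketch gives the same FTO values as the notebook output in the diff. For example, the `S` turn starting at 4052 gets `None` because the only `H` turn so far started at the same moment, which is inside the 200 ms planning buffer.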