nosub -p download only grabs first hit in batch #1

Open
hawkrobe opened this issue May 10, 2018 · 19 comments

Comments

@hawkrobe

hawkrobe commented May 10, 2018

Thanks for creating this! We'd made a custom shim around cosub to handle batching and bonuses and such, but this is much nicer. Just wanted to report an issue. I launched 20 assignments in batch mode. Running cosub -p status reports that there are 3 HITs in the batch with results, but when I run cosub -p download it looks like it only pulls results from the first in the list instead of looping through all of them (see screenshot below).

image

@hawkrobe
Author

It looks like this is because, when HITs are still pending, this condition never returns. This was somewhat unexpected behavior (i.e. I expected to be able to pull down partial results while waiting for others to finish).
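To make the suspected failure mode concrete, here is a minimal sketch. It is not nosub's code: `fetch_page`, `download_until_count`, and `download_until_empty` are hypothetical names, and it assumes the highlighted condition belongs to a loop that pages through results. If such a loop exits only once it has collected the full number of assignments, it cannot finish while one of them is still pending; a loop that stops on the first empty page returns whatever has been submitted so far.

```python
from typing import List, Tuple


def fetch_page(submitted: List[str], page: int, page_size: int) -> List[str]:
    """Stand-in for one paginated API request (pages are 1-indexed)."""
    start = (page - 1) * page_size
    return submitted[start:start + page_size]


def download_until_count(
    submitted: List[str],
    expected: int,
    page_size: int = 10,
    max_requests: int = 5,
) -> Tuple[List[str], bool]:
    """Fragile loop: exits only once `expected` assignments are collected.

    `max_requests` exists only so this demo terminates. Without it the loop
    would keep requesting empty pages for as long as fewer than `expected`
    assignments have been submitted.
    """
    collected: List[str] = []
    page = 1
    requests = 0
    while len(collected) < expected:
        if requests == max_requests:
            return collected, False  # gave up: this models the hang
        collected.extend(fetch_page(submitted, page, page_size))
        requests += 1
        page += 1
    return collected, True


def download_until_empty(submitted: List[str], page_size: int = 10) -> List[str]:
    """Partial-download loop: stops at the first empty page."""
    collected: List[str] = []
    page = 1
    while True:
        batch = fetch_page(submitted, page, page_size)
        if not batch:
            return collected
        collected.extend(batch)
        page += 1


if __name__ == "__main__":
    # Hypothetical HIT: 9 assignments requested, 8 submitted, 1 pending.
    submitted_ids = [f"ASSIGNMENT_{i}" for i in range(8)]
    print(download_until_count(submitted_ids, expected=9))
    print(download_until_empty(submitted_ids))
```

Whether the real loop behaves like the first function is exactly what is in question here; the sketch only shows why such a loop would block on a HIT with a pending assignment.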

@longouyang
Owner

Thanks for letting me know -- this is definitely unexpected behavior and should be fixed.

I'm not 100% certain of your diagnosis -- that bit of code you highlighted handles paginated downloads for a single HIT.

At any rate, I'm in a conference push right now, so won't be able to get to this for about a week or so.

@hawkrobe
Author

no worries -- good luck with nips 💯

longouyang changed the title from "cosub -p download only grabs first hit in batch" to "nosub -p download only grabs first hit in batch" on May 13, 2018
@hawkrobe
Author

hawkrobe commented Jun 3, 2018

hi long -- any progress on this? we have a few people joining us over the summer who will be starting to run experiments soon and it'd be great if this were taken care of (hopefully won't require too big a change once the issue is diagnosed)

@longouyang
Owner

i was on a computer-less trip for a week and then got sidelined by a cold. i think i should be able to take a look at it this week though.

@longouyang
Owner

for the example that you posted, what does your directory structure look like?

@longouyang
Owner

also, do you have a currently running HIT that can reproduce this problem?

@hawkrobe
Author

Yes, it's not specific to any HIT -- any time you turn batch mode on, this happens.

@hawkrobe
Author

hawkrobe commented Jul 13, 2018

and there's no special directory structure, I'm creating an empty directory and typing nosub init inside it, then nosub upload

@longouyang
Owner

Apologies for the long delay.

In the months since we last discussed this, I've picked this back up a couple of times but never managed to reproduce this behavior. Debugging is hard because there's a server-side component I can't control, and testing in production would cost money.

The next time you encounter this, if it's not urgent, could you expire the HIT, and send me your nosub + auth files for debugging?

@mhtess

mhtess commented Aug 1, 2019

Bumping this. Anything we can do to help diagnose the issue?

@longouyang
Owner

If you encounter this bug, can you expire the HIT and send me your nosub + auth files for debugging?

@nataliavelez

nataliavelez commented Aug 16, 2019

Hi Long,

Thanks for making nosub! I'm having the same issue. Where should I send you my files?

Thanks,
Natalia

Edit: Just sent you the files to your Gmail address!

@nataliavelez

nataliavelez commented Aug 17, 2019

Update, for posterity: I was trying to download HITs from an experiment that was split into six batches. nosub was tracking the status of all six:

nosub -p status
Running on production
ID                              Created                Expiration             Assignments  NumPending  NumAvailable  NumCompleted
------------------------------  ---------------------  ---------------------  -----------  ----------  ------------  ------------
3HUR21WDDU2BXS4M681E9WPJ8A1XYA  8/15/2019 11:48:47 AM  8/18/2019 11:48:47 AM  9            0           0             9           
382GHPVPHS4JJNJOXC6JMRW86E234W  8/15/2019 11:48:48 AM  8/18/2019 11:48:48 AM  9            1           0             8           
3ZFRE2BDQ9RB2IER2U3XN3YLE45XZ1  8/15/2019 11:48:49 AM  8/18/2019 11:48:49 AM  9            0           0             9           
3RWB1RTQDJ0R9DBD7DT3G6FEDWQ8PK  8/15/2019 11:48:50 AM  8/18/2019 11:48:50 AM  9            0           0             9           
34R0BODSP1C4P43L016PAA6JXZN5EU  8/15/2019 11:48:51 AM  8/18/2019 11:48:51 AM  9            0           0             9           
3J94SKDEKI2ZXYPL46V54SOYXNMD59  8/15/2019 11:48:51 AM  8/18/2019 11:48:51 AM  5            0           0             5           

Total available: 50
Total completed: 49

But it would only download HITs from the first two batches:

nosub -p download
Running on production
Getting status of HIT 3HUR21WDDU2BXS4M681E9WPJ8A1XYA
We have 9/9 assignments
Getting status of HIT 382GHPVPHS4JJNJOXC6JMRW86E234W
We have 8/9 assignments
1 Skipping 3TESA3PJ32N3VJDXZ0GWS4IYN44MM9
2 Skipping 3OF2M9AATH1842CDTDTALTM3D3RKZJ
3 Skipping 3OVHNO1VE7E0QW631W7NHGRDTUUZDL
4 Skipping 3634BBTX0P7BTQVPSYER5LR89U8IFI
5 Skipping 3VJ40NV2QJ0V8HOTLDWL6M2VCCFOTR
6 Skipping 3VFJCI1K40CU0PU3T5EI1YSBX5RRGU
7 Skipping 3S06PH7KSSH3V0LVTLYVPM71GFHD1A
8 Skipping 3GS6S824SRA5IDSBF31JEH9PQWSWNA
9 Skipping 3TESA3PJ32N3VJDXZ0GWS4IYN44MM9
10 Skipping 3OF2M9AATH1842CDTDTALTM3D3RKZJ
11 Skipping 3OVHNO1VE7E0QW631W7NHGRDTUUZDL
12 Skipping 3634BBTX0P7BTQVPSYER5LR89U8IFI
13 Skipping 3VJ40NV2QJ0V8HOTLDWL6M2VCCFOTR
14 Skipping 3VFJCI1K40CU0PU3T5EI1YSBX5RRGU
15 Skipping 3S06PH7KSSH3V0LVTLYVPM71GFHD1A
16 Skipping 3GS6S824SRA5IDSBF31JEH9PQWSWNA

I sent Long my experiment files and authentication files, and he was unable to reproduce the error. I tried it again myself, and I was able to download files from all six batches:

nosub  -p download
Running on production
Getting status of HIT 3HUR21WDDU2BXS4M681E9WPJ8A1XYA
We have 9/9 assignments
Getting status of HIT 382GHPVPHS4JJNJOXC6JMRW86E234W
We have 8/9 assignments
1 Skipping 3TESA3PJ32N3VJDXZ0GWS4IYN44MM9
2 Skipping 3OF2M9AATH1842CDTDTALTM3D3RKZJ
3 Skipping 3OVHNO1VE7E0QW631W7NHGRDTUUZDL
4 Skipping 3634BBTX0P7BTQVPSYER5LR89U8IFI
5 Skipping 3VJ40NV2QJ0V8HOTLDWL6M2VCCFOTR
6 Skipping 3VFJCI1K40CU0PU3T5EI1YSBX5RRGU
7 Skipping 3S06PH7KSSH3V0LVTLYVPM71GFHD1A
8 Skipping 3GS6S824SRA5IDSBF31JEH9PQWSWNA
9 Downloaded 34T446B1C1RTJJUZX6ZKRB69Q3HC0K
Getting status of HIT 3ZFRE2BDQ9RB2IER2U3XN3YLE45XZ1
We have 0/9 assignments
1 Downloaded 3FE2ERCCZYLXGPVM4WN11C1YN9UOPM
2 Downloaded 3N2BF7Y2VR7H35CM830J0ZA26I5MH0
3 Downloaded 36W0OB37HXRH2CB5NSQD816BLMNZHU
4 Downloaded 3R2PKQ87NXLHZ0N6ELI40BBH6VKMIT
5 Downloaded 36NEMU28XGQZ0V7B32MJ12KPL6IWMA
6 Downloaded 3HPZF4IVNN6QGEQK5EATJ30T20ECY7
7 Downloaded 3TEM0PF1Q6A3OB0DX2UVBHORBNBD07
8 Downloaded 33JKGHPFYD79D1YXB1VW5J7YSUEMN9
9 Downloaded 3LOTDFNYA8CTULFUHFI66C0GO56FWM
Getting status of HIT 3RWB1RTQDJ0R9DBD7DT3G6FEDWQ8PK
We have 0/9 assignments
1 Downloaded 3A4NIXBJ77CJP1VCZOFHSR39NZEML7
2 Downloaded 3LQ8PUHQFM5V7MMVNO2R863LP9LIHG
3 Downloaded 3ZAK8W07I5RP5DBZJXKKLHSFXN6U0Y
4 Downloaded 3ZSY5X72NYOIS2B1HORQ671DORIORE
5 Downloaded 3IUZPWIU1PK4A778IQ93MPSLW1NKWT
6 Downloaded 339ANSOTR6FM9CN3T95OLYJDHQPIKK
7 Downloaded 3N8OEVH1FS3FVPJLPWAMCCW8D4YOO5
8 Downloaded 3SNLUL3WO502290L8Q63J1K30LHUL7
9 Downloaded 3FIJLY1B6VH3ACIT5T4BEISKV83FPR
Getting status of HIT 34R0BODSP1C4P43L016PAA6JXZN5EU
We have 0/9 assignments
1 Downloaded 3CP1TO84PUEFG8OYXJGQWTTEV4Y259
2 Downloaded 3VE8AYVF8NAI4KJCJC20SWXU6ZL8FK
3 Downloaded 38BQUHLA9XDRVMY9CGV9K873I4JMOO
4 Downloaded 3TK8OJTYM2YS694J589FW4V6J48VPG
5 Downloaded 30X31N5D6435RDHDMUMDVMM4LE8AS4
6 Downloaded 3JWH6J9I9TQDUAU0KC5NR3W4GVANBL
7 Downloaded 3Z2R0DQ0JIRFCRHR8K9T0NZR8YO2E3
8 Downloaded 3KGTPGBS6YYW1NEDYKOY45LUULQU2I
9 Downloaded 3AWETUDC935HY7MPTA8Y8D7QLZ2ZIN
Getting status of HIT 3J94SKDEKI2ZXYPL46V54SOYXNMD59
We have 0/5 assignments
1 Downloaded 38F5OAUN5OPYI25Z4XYUHHDCBTA7HA
2 Downloaded 3R9WASFE20TXOGKZS22D5GUY57AZFW
3 Downloaded 3ZSANO2JCGK0N4YLXA859NXVISAFS7
4 Downloaded 3DQQ64TANHY5LY4OVIEBI6EKQBTWP6
5 Downloaded 34J10VATJGB8KFLY6EPLHKVMGYYIQX

I have no idea why the problem fixed itself. Maybe it was a fluke? I'll let you know if I run into this again!

@hawkrobe
Author

hawkrobe commented Sep 28, 2019

@nataliavelez: I assume the problem was fixed after the final participant finished. It seems to be 'blocked' by whichever batch isn't yet complete! The problem specifically happens when some HITs are still pending, which makes it difficult to send @longouyang the files needed to reproduce it in time, i.e. before they're no longer pending. For example, I'm running into it now with this compensation HIT:

➜   nosub -p status
Running on production
ID                              Created               Expiration            Assignments  NumPending  NumAvailable  NumCompleted
------------------------------  --------------------  --------------------  -----------  ----------  ------------  ------------
3KA7IJSNW656VL2EL0DBAVQVEO6BP5  9/27/2019 3:10:36 AM  10/1/2019 3:10:36 AM  9            0           9             0
3511RHPADVE3K745P56UTTMFR08LRU  9/27/2019 3:10:36 AM  10/1/2019 3:10:36 AM  7            0           6             1
3Z3R5YC0P3NU0U717J8RYVIDIPJFTW  9/27/2019 3:07:09 AM  9/29/2019 3:07:09 AM  9            1           8             0

Total available: 25
Total completed: 1
➜  nosub -p download
Running on production
Getting status of HIT 3Z3R5YC0P3NU0U717J8RYVIDIPJFTW
We have 0/1 assignments

It hangs without downloading the completed assignment, and note that the HIT it blocks on is the one with the pending assignment...
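For reference, here is a small sketch of the behavior I expected, using the three rows from the status table above. The `HitStatus` class and both helper functions are hypothetical and not part of nosub; the point is only that each HIT can be treated independently, so a pending assignment in one HIT does not stop the completed assignment in another from being downloaded.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class HitStatus:
    """One row of the `nosub -p status` table."""
    hit_id: str
    assignments: int
    pending: int
    available: int
    completed: int


# Rows copied from the status output above.
STATUS_ROWS = [
    HitStatus("3KA7IJSNW656VL2EL0DBAVQVEO6BP5", 9, 0, 9, 0),
    HitStatus("3511RHPADVE3K745P56UTTMFR08LRU", 7, 0, 6, 1),
    HitStatus("3Z3R5YC0P3NU0U717J8RYVIDIPJFTW", 9, 1, 8, 0),
]


def downloadable_now(hits: List[HitStatus]) -> Dict[str, int]:
    """Map each HIT with completed work to its number of completed assignments."""
    return {h.hit_id: h.completed for h in hits if h.completed > 0}


def hits_with_pending(hits: List[HitStatus]) -> List[str]:
    """HITs that have at least one accepted but unsubmitted assignment."""
    return [h.hit_id for h in hits if h.pending > 0]


if __name__ == "__main__":
    print("Can download now:", downloadable_now(STATUS_ROWS))
    print("Has pending work:", hits_with_pending(STATUS_ROWS))
```

With these rows, the HIT that has something to download and the HIT that has a pending assignment are different HITs, which is why the hang on the pending one is surprising.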

@nataliavelez

nataliavelez commented Sep 28, 2019 via email

@hawkrobe
Author

Hurrah for generalizing from multiple examples! (I couldn't see what my cases had in common until I saw yours!)

@longouyang
Owner

Just chiming in here to signal that I'm keeping tabs on this.

I'm trying to reconcile this hypothesis with some earlier detective work that Natalia and I had done which seems inconsistent with the explanation (but maybe there are multiple issues?). Summarizing that investigation: she had sent me a HIT which was having the issue but which we had expired for debugging purposes. I didn't find any bugs but eventually (after a few days) downloading the results Just Worked, even though no new completions occurred.

My current thinking is that Robert's explanation is likely right but there's other weirdness going on as well. I'll devote some time to more investigation this week.

@hawkrobe
Author

hmmm, interesting! one possible reconciliation (still just brainstorming) is that participants can hold on to an assignment as 'pending' for some time after the HIT is officially expired by the experimenter, with that additional time set by the assignment duration. Not sure what the assignment duration was, but maybe it started working after they eventually returned it (which would explain why no further completions were recorded: it moved out of the 'pending' column into the 'available' column?)
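A rough sketch of that timing argument, in case it helps. The function names and all the timestamps in the demo are hypothetical (the thread does not record the actual accept time or assignment duration); it only assumes that an accepted assignment can stay pending until its accept time plus the assignment duration, regardless of when the HIT was expired.

```python
from datetime import datetime, timedelta


def pending_deadline(accepted_at: datetime, assignment_duration: timedelta) -> datetime:
    """Latest moment an accepted assignment can remain pending (assumed rule)."""
    return accepted_at + assignment_duration


def pending_after_expiry(
    accepted_at: datetime,
    assignment_duration: timedelta,
    expired_at: datetime,
) -> timedelta:
    """How long past the HIT's expiry the assignment could still be pending."""
    overhang = pending_deadline(accepted_at, assignment_duration) - expired_at
    return max(overhang, timedelta(0))


if __name__ == "__main__":
    # Hypothetical values, chosen only to illustrate the calculation.
    accepted = datetime(2019, 1, 1, 12, 0)
    duration = timedelta(hours=2)
    expired = datetime(2019, 1, 1, 12, 30)
    print(pending_after_expiry(accepted, duration, expired))
```

If the real assignment duration was on the order of days, a window like this could cover the "after a few days" delay, but that is a guess.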

It's also likely there's just totally other weirdness explaining it! :)
