desi_pipe sync doesn't sync redrock output #563

Closed
sbailey opened this issue Mar 28, 2018 · 4 comments · Fixed by #590

@sbailey
Contributor

sbailey commented Mar 28, 2018

My redrock job timed out, leaving the redshift tasks only partially done. desi_pipe sync doesn't recognize that: it leaves all of my tasks in the "ready" state instead of moving those that already have output files into the "done" state.
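
To make the expected behavior concrete, here is a minimal sketch of the check a sync step could apply. It is not the actual desi_pipe code: the function name sync_states, the has_output callback, and the plain dict of task states are all hypothetical stand-ins for whatever the pipeline database really uses.

```python
def sync_states(tasks, has_output):
    """Return a new {task_name: state} dict after a sync pass.

    tasks      -- dict mapping task name to its current state (hypothetical layout)
    has_output -- callable(task_name) -> bool, True if the task's output file
                  exists; in practice this might wrap os.path.isfile on the
                  expected output path (an assumption, not desi_pipe's API)
    """
    synced = {}
    for name, state in tasks.items():
        # A task that was ready or running and already has its output on
        # disk is treated as finished; every other task keeps its state.
        if state in ("ready", "running") and has_output(name):
            synced[name] = "done"
        else:
            synced[name] = state
    return synced
```

A real implementation would also need to decide what to do with output files that were only partially written when the job was killed; this sketch treats any existing file as complete.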

Optimistically assigning this to 18.3 'cause it would be really really nice to have this working for this major release.

@sbailey
Contributor Author

sbailey commented Mar 28, 2018

Also: oddly, none of the tasks are in the "running" state, even though many of them ran and finished and others were mid-run when the job got killed. The command I ran was

desi_pipe tasks --tasktype redshift --states ready,waiting > redshift.tasks
srun -A desi -t 15:00 -C haswell --qos interactive -N 24 -n 736 -c 2 \
  desi_pipe_exec_mpi --tasktype redshift --taskfile redshift.tasks

(note the 15-minute time limit, -t 15:00, which I should have made longer)

@tskisner
Member

Redshift and spectra tasks are “special”, but I agree that their state should move from ready to running. On the other hand I am not sure they are ever “done”. There is a separate DB table that tracks which pixels are touched for a given exposure, and that is how we can schedule redshift tasks.

Thanks for opening this to track the question.

@sbailey
Contributor Author

sbailey commented Mar 29, 2018

Agreed about the ambiguity of "done" since we don't know if a future observation will contribute more spectra to that same healpix. At the same time, if a job times out and leaves some pixels unfinished, we don't want to have to start over again from the beginning.

Suggestion: after redrock finishes a pixel containing all spectra to date, mark it as "done". When desi_pipe update runs to ingest new data, move any spectra/redshifts covered by those data from "done" back into "waiting". Does it already do something like that for the spectra file generation? I see that those do move to "done" when processed.
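
A rough sketch of the transitions suggested above, assuming task states are kept in a simple dict keyed by healpix pixel. The function names mark_done and ingest_new_data and the dict layout are hypothetical; the real pipeline tracks this in its database.

```python
def mark_done(states, pixel):
    """Record that redrock finished a pixel containing all spectra to date."""
    states[pixel] = "done"


def ingest_new_data(states, touched_pixels):
    """Reset pixels covered by newly ingested data.

    states         -- dict mapping healpix pixel -> task state (hypothetical)
    touched_pixels -- iterable of pixels that the new data contribute to
    """
    for pixel in touched_pixels:
        if states.get(pixel) == "done":
            # New spectra arrived for a finished pixel: it must be redone.
            states[pixel] = "waiting"
        else:
            # Pixels seen for the first time start out waiting; pixels that
            # are already waiting, ready, or running are left alone.
            states.setdefault(pixel, "waiting")
```

With this scheme a timed-out job loses only the pixels it had not finished, since completed pixels stay "done" until new data touch them.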

Related on the redrock side: if the output zbest file already exists, only refit targets that have new data in the input (desihub/redrock#105). That will avoid wasting CPU time if the pipeline does try to rerun a pixel that doesn't need to be updated.
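
One way to picture that selection, as a sketch only: this is not redrock's implementation, and the function name and the per-target sets of exposure IDs are assumptions made for illustration.

```python
def targets_to_refit(input_exposures, fitted_exposures):
    """Return the sorted target IDs whose input has data not yet fitted.

    input_exposures  -- dict: target ID -> set of exposure IDs in the input
    fitted_exposures -- dict: target ID -> set of exposure IDs that were used
                        for the existing output (empty dict if no output yet)
    Both layouts are hypothetical.
    """
    return sorted(
        target
        for target, exposures in input_exposures.items()
        # Refit when any input exposure is missing from the earlier fit;
        # a target absent from the earlier output is always refit.
        if not exposures <= fitted_exposures.get(target, set())
    )
```

For example, a target whose input and previously fitted exposure sets are identical would be skipped, which is where the CPU savings would come from.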

tskisner self-assigned this Apr 16, 2018
@tskisner
Member

This is related to #558, and I'll fix them both in the same PR.
