
Running Worker Machines Without External IP Address #26

Closed
obsh opened this issue Aug 20, 2019 · 5 comments

Comments

@obsh

obsh commented Aug 20, 2019

Hi,

I wonder if there is an option to create worker machines without external IP addresses?
I'm trying to run a large number of pipelines in GCP and I'm stuck at the external IP address quota.

Regards.

@samanvp
Contributor

samanvp commented Aug 20, 2019

Unfortunately, each worker needs an external IP address to communicate back with the main runner.
What I suggest is:

  • Use larger workers to reduce the number of workers and thus the number of external IPs needed. For example, for the make_examples stage you can use 2 workers, each with 16 cores and 4*16=64GB of memory (see the sketch after this list). Using larger workers costs you more than using smaller workers, which is the more cost-optimized way to run DeepVariant.
  • Serialize your runs instead of parallelizing them. This way you can keep the cost of each run as low as possible (by using smaller workers), but your overall running time will be longer.
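
To make the larger-worker option concrete, here is a minimal sketch, assuming the gcp_deepvariant_runner flags that appear in the full invocation later in this thread; the worker, core, RAM, and shard values are illustrative assumptions, not tested recommendations:

  # Illustrative values only: 2 make_examples workers, 16 cores each,
  # 4GB of RAM per core (4*16=64GB), and one shard per core (2*16=32 shards).
  ./opt/deepvariant_runner/bin/gcp_deepvariant_runner \
    --project "${PROJECT_ID}" \
    --zones "${ZONES}" \
    --docker_image "${DOCKER_IMAGE}" \
    --outfile "${OUTPUT_BUCKET}"/"${OUTPUT_FILE_NAME}" \
    --staging "${OUTPUT_BUCKET}"/"${STAGING_FOLDER_NAME}" \
    --model "${MODEL}" \
    --ref "${INPUT_REF}" \
    --bam "${INPUT_BAM}" \
    --shards 32 \
    --make_examples_workers 2 \
    --make_examples_cores_per_worker 16 \
    --make_examples_ram_per_worker_gb 64 \
    --make_examples_disk_per_worker_gb 200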

Unfortunately, there is no perfect solution; you have to compromise on either cost or time.

Please let me know if you need help with setting the input arguments to optimize the cost based on the size of the BAM file and the type of analysis.

@obsh
Author

obsh commented Aug 20, 2019

Thank you for the recommendations! I’ll try to run with larger worker machines.

Sure, I’d appreciate any suggestions on the run configuration.
I’m working on a cannabis variants project with a Googler @allenday, and I think the goal is to optimize for a smaller overall running time. We have 16,000 BAM files with sizes ranging from 60MB to 17GB and reference .fa files from 300MB to 1.2GB. We need to produce VCF files.
From the experience of running a couple of pipelines, we selected make_examples worker machines with 60GB RAM and 10 CPUs, as VMs were failing with an "out of memory" error when given less RAM.

All arguments to the runner:

cmd: |
  ./opt/deepvariant_runner/bin/gcp_deepvariant_runner \
    --project "${PROJECT_ID}" \
    --zones "${ZONES}" \
    --docker_image "${DOCKER_IMAGE}" \
    --docker_image_gpu "${DOCKER_IMAGE_GPU}" \
    --gpu \
    --outfile "${OUTPUT_BUCKET}"/"${OUTPUT_FILE_NAME}" \
    --staging "${OUTPUT_BUCKET}"/"${STAGING_FOLDER_NAME}" \
    --model "${MODEL}" \
    --ref "${INPUT_REF}" \
    --bam "${INPUT_BAM}" \
    --shards 512 \
    --make_examples_workers 16 \
    --make_examples_cores_per_worker 10 \
    --make_examples_ram_per_worker_gb 60 \
    --make_examples_disk_per_worker_gb 200 \
    --call_variants_workers 16 \
    --call_variants_cores_per_worker 8 \
    --call_variants_ram_per_worker_gb 30 \
    --call_variants_disk_per_worker_gb 50

@obsh
Author

obsh commented Aug 20, 2019

With the following model and images:

MODEL=gs://deepvariant/models/DeepVariant/0.6.0/DeepVariant-inception_v3-0.6.0+cl-191676894.data-wgs_standard
IMAGE_VERSION=0.6.1
DOCKER_IMAGE=gcr.io/deepvariant-docker/deepvariant:"${IMAGE_VERSION}"
DOCKER_IMAGE_GPU=gcr.io/deepvariant-docker/deepvariant_gpu:"${IMAGE_VERSION}"

@samanvp
Contributor

samanvp commented Aug 20, 2019

Here are a couple of small changes that will definitely make your run more efficient:

  • You should set the number of shards equal to make_examples_workers times make_examples_cores_per_worker, basically one shard per core.
  • Since your BAM files vary in size, I'd put them into 2-3 buckets; say less than 1GB, between 1-10GB, and larger than 10GB. I'd set --make_examples_workers 1 for all 3 groups (to save on external IPs) and --make_examples_cores_per_worker 4, 8, and 16, respectively, for the three buckets.
  • In all my previous tests it was enough to set 4GB of RAM per core, for both the make_examples and call_variants steps. However, it seems this was not enough in your case and you ended up with 6GB per core.
  • For the call_variants step you are wasting way too many resources. What we recommend in our automatic flag values (pending PR Automatic flags based on size of BAM file #11) for BAM files up to 200GB is 2 workers equipped with GPUs. Here I recommend 1 worker with a GPU for all BAM sizes.
  • When you are using a GPU for call_variants you don't need many cores, because the GPU will be doing all the heavy lifting. What we recommend is to use workers with 4 cores and 4*4=16GB of RAM, equipped with a GPU, for this stage (see the combined sketch after this list).
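
Putting these suggestions together, here is a sketch of how the flags could look for, say, the 1-10GB bucket: one make_examples worker with 8 cores and 4GB of RAM per core, one shard per core, and a single GPU-equipped call_variants worker with 4 cores and 16GB of RAM. The exact values are assumptions to tune against your own data, not verified settings:

  # Sketch for the 1-10GB BAM bucket (illustrative values, not tested):
  # one make_examples worker with 8 cores, 4GB of RAM per core, one shard per
  # core; one GPU-equipped call_variants worker with 4 cores and 4*4=16GB RAM.
  ./opt/deepvariant_runner/bin/gcp_deepvariant_runner \
    --project "${PROJECT_ID}" \
    --zones "${ZONES}" \
    --docker_image "${DOCKER_IMAGE}" \
    --docker_image_gpu "${DOCKER_IMAGE_GPU}" \
    --gpu \
    --outfile "${OUTPUT_BUCKET}"/"${OUTPUT_FILE_NAME}" \
    --staging "${OUTPUT_BUCKET}"/"${STAGING_FOLDER_NAME}" \
    --model "${MODEL}" \
    --ref "${INPUT_REF}" \
    --bam "${INPUT_BAM}" \
    --shards 8 \
    --make_examples_workers 1 \
    --make_examples_cores_per_worker 8 \
    --make_examples_ram_per_worker_gb 32 \
    --make_examples_disk_per_worker_gb 200 \
    --call_variants_workers 1 \
    --call_variants_cores_per_worker 4 \
    --call_variants_ram_per_worker_gb 16 \
    --call_variants_disk_per_worker_gb 50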

I just want to mention that all my experience optimizing these flags comes from human sample BAM files. I am not really sure what the density of variants in cannabis is, so you might want to apply some fine-tuning on top of what I suggested.

Please let me know if there is anything else I can help with.

@obsh
Author

obsh commented Aug 21, 2019

Thank you very much for the recommendations and the explanation of the logic behind them! I'll try to run a new setup this week.

@obsh obsh closed this as completed Aug 21, 2019