Run Husky in a Cluster

YANG Fan edited this page Dec 2, 2016 · 3 revisions

There are two ways to run Husky in a cluster: it can be started through YARN, or it can run in standalone mode.

Husky on YARN

On a YARN cluster, Husky can be started through YARN. The Husky-on-YARN project is dedicated to this purpose. Users interested in running Husky on YARN should follow the instructions on that project's page.

Husky Standalone

We provide a script, exec.sh, to run Husky in a distributed cluster. Users may need to modify this script before use. The first step is to define the variable MACHINE_CFG, which should point to a file that lists the host names of all machines in the cluster, one per line. For example:

MACHINE_CFG=/path/to/config

In /path/to/config:

machine1
machine2
machine3
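
Before launching, it can be useful to confirm that the machine file is readable and lists the expected number of hosts. The following is a minimal sketch, not part of exec.sh: the function name count_hosts is hypothetical, and it assumes one host per line, skipping blank lines and lines that start with #.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of exec.sh): print the number of hosts
# listed in a machine file, ignoring blank lines and '#' comment lines.
count_hosts() {
    local cfg="$1"
    # Fail early if the file is missing or unreadable.
    [ -r "$cfg" ] || { echo "cannot read $cfg" >&2; return 1; }
    # -c counts lines, -v inverts the match, so this counts every line
    # that is neither empty nor a comment.
    grep -cvE '^[[:space:]]*(#|$)' "$cfg"
}
```

For the example file above, count_hosts /path/to/config would print the number of machines listed in it.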

Next, define BIN_DIR, the directory that contains the Husky programs. We recommend placing it on NFS, so that the program does not have to be copied to every machine. The Husky program to run must be copied into BIN_DIR before use. For example:

BIN_DIR=/some/nfs/husky/release
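
Since the program must already be in BIN_DIR when the script runs, a small check before launching can catch a missing or non-executable binary early. This is a sketch under assumptions: program_ready is a hypothetical function, not part of exec.sh, and it only checks the copy visible from the machine where it runs.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check (not part of exec.sh): succeed only if
# the program exists in bin_dir as an executable regular file.
program_ready() {
    local bin_dir="$1" program="$2"
    if [ -f "${bin_dir}/${program}" ] && [ -x "${bin_dir}/${program}" ]; then
        return 0
    fi
    echo "${program} is missing or not executable in ${bin_dir}" >&2
    return 1
}
```

A call such as program_ready "${BIN_DIR}" MyProgram (MyProgram being a placeholder name) could be placed just before the launch statement.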

Advanced Config

When BIN_DIR is on NFS and the program (i.e., the binary) is updated frequently during testing, different machines may end up seeing different copies of it. A quick fix is to change the last statement of exec.sh so that each machine lists BIN_DIR before launching the program:

time pssh -t 0 -P -h ${MACHINE_CFG} -x "-t -t" "cd ${BIN_DIR} && ls > /dev/null && ./$@"
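
To check whether the machines really see the same binary, one option is to compare a checksum of the file as reported by each host. The sketch below is an illustration only: both function names are hypothetical, and it assumes passwordless ssh to every host and that md5sum is installed on each machine.

```shell
#!/usr/bin/env bash
# Succeeds only if every non-empty line on stdin is identical,
# e.g. one checksum per machine.
checksums_agree() {
    local distinct
    distinct=$(grep -v '^[[:space:]]*$' | sort -u | wc -l | tr -d '[:space:]')
    [ "$distinct" -eq 1 ]
}

# Hypothetical collection loop: print the checksum of one binary as
# seen by each host listed in the machine file.
collect_checksums() {
    local cfg="$1" binary="$2" host
    while read -r host; do
        [ -z "$host" ] && continue
        # -n stops ssh from consuming the rest of the host list on stdin.
        ssh -n "$host" "md5sum '$binary'" | awk '{print $1}'
    done < "$cfg"
}
```

Piping the two together, as in collect_checksums "${MACHINE_CFG}" "${BIN_DIR}/MyProgram" | checksums_agree, returns success only when all hosts report the same checksum (MyProgram is a placeholder name).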

To generate core dumps for later analysis, remove the core file size limit before launching the program:

time pssh -t 0 -P -h ${MACHINE_CFG} -x "-t -t" "cd ${BIN_DIR} && ulimit -c unlimited && ./$@"
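
The two adjustments can also be combined in one remote command. The sketch below builds that command string; build_remote_cmd and the switches REFRESH_DIR and CORE_DUMPS are hypothetical names introduced for illustration and do not exist in exec.sh.

```shell
#!/usr/bin/env bash
# Build the command string that each machine runs.
# REFRESH_DIR and CORE_DUMPS are hypothetical switches, not part of exec.sh.
build_remote_cmd() {
    local bin_dir="$1"; shift
    local cmd="cd ${bin_dir}"
    # Optionally list the directory first (the NFS workaround).
    [ "${REFRESH_DIR:-0}" = "1" ] && cmd="${cmd} && ls > /dev/null"
    # Optionally lift the core file size limit.
    [ "${CORE_DUMPS:-0}" = "1" ] && cmd="${cmd} && ulimit -c unlimited"
    # Remaining arguments are the program name and its arguments.
    echo "${cmd} && ./$*"
}
```

The result could then replace the quoted command in the last statement, for instance: time pssh -t 0 -P -h ${MACHINE_CFG} -x "-t -t" "$(build_remote_cmd "${BIN_DIR}" "$@")".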