From b70a25931b41c8569c610b3e617a1699ce0fe8f9 Mon Sep 17 00:00:00 2001
From: Clayton Mellina
Date: Tue, 5 Apr 2016 23:06:26 -0700
Subject: [PATCH] Update README.md

---
 README.md | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index cef61cb..da10e3e 100644
--- a/README.md
+++ b/README.md
@@ -1,12 +1,18 @@
 # Spark Partition Server
 
-`spark-partition-server` is a set of light-weight Python components to launch servers on the executors of a Spark cluster.
+`spark-partition-server` is a set of light-weight Python components to launch servers on the executors of an [Apache Spark](https://spark.apache.org/) cluster.
 
 ## Overview
 
-Spark is designed for manipulating and distributing data within the cluster, but not for allowing clients to interact with the data directly. `spark-partition-server` provides primitives for launching arbitrary servers on partitions of an RDD, registering and managing the partitions servers on the driver, and collecting any resulting RDD after the partition servers are shutdown.
+Apache Spark is designed for manipulating and distributing data within a cluster, but not for allowing clients to interact with the data directly. `spark-partition-server` provides primitives for launching arbitrary servers on partitions of an RDD, registering and managing the partition servers on the driver, and collecting any resulting RDD after the partition servers are shut down.
 
-There are many use-cases such as building ad hoc search clusters to query data more quickly by skipping Spark's job planning, allowing external services to interact directly with in-memory data on Spark as part of a computing pipeline, and enabling distributed computations amongst executors involving direct communication. Spark Partition Server itself provides building blocks for these use cases. 
+There are many use cases, such as building ad hoc search clusters to query data more quickly by skipping Spark's job planning, allowing external services to interact directly with in-memory data on Spark as part of a computing pipeline, and enabling distributed computations among executors involving direct communication (e.g. [CaffeOnSpark](https://github.com/yahoo/CaffeOnSpark)). Spark Partition Server provides building blocks for these use cases.
+
+## Installation
+
+```
+pip install spark-partition-server
+```
 
 ## Simple Usage Example
 