Adjust some expressions in the User Guide
zehuanw authored May 14, 2020
1 parent 3fd875e commit 628acda
Showing 1 changed file with 7 additions and 7 deletions.
14 changes: 7 additions & 7 deletions docs/hugectr_user_guide.md
@@ -11,13 +11,13 @@ Highlighted features of HugeCTR
 ## New Features in Version 2.1
 HugeCTR version 2.1 is a major update which aims to provide a flexible, fast, scalable and reliable solution for CTR Training. Framework designers can consider it as a high-performance reference design.
 
-* Supporting three important networks: Wide and Deep Learning (WDL)[1], Deep Cross Network (DCN)[2] and DeepFM [3]
-* A new embedding implementation `LocalizedSlotSparseEmbedding` which reduces the memory transactions across GPUs and nodes resiliently to the number of GPUs.
-* Supporting multiple Embeddings in one network
-* Supporting dense feature input, which doesn't need any embdding layer
-* Supporting new layers like: Dropout / Split / Reshape / Multiply / FmOrder2 / MultCross / Add
-* Check bits in data reader to enable data check and error skip.
-* L1 / L2 Regularization
+* Supporting three important networks: Wide and Deep Learning (WDL)[1], Deep Cross Network (DCN)[2] and DeepFM [3];
+* A new embedding implementation `LocalizedSlotSparseEmbedding` which reduces the memory transactions across GPUs and nodes resiliently to the number of GPUs;
+* Supporting multiple Embeddings in one network;
+* Supporting dense feature input;
+* Supporting new layers like: Dropout / Split / Reshape / Multiply / FmOrder2 / MultCross / Add;
+* Check bits in data reader to enable data check and error skip;
+* L1 / L2 Regularization.
 
 ## Architecture and Supported Networks
 To enable large embedding training, the embedding table in HugeCTR is model parallel and distributed across all the GPUs in a homogeneous cluster, which consists of multiple nodes and multiple GPUs. Meanwhile, the dense model such as DNN is data parallel, which has one copy in each GPU (see Fig.1).
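The last context line of the hunk describes the split the guide is built around: the embedding table is model parallel (partitioned across GPUs), while the dense network is data parallel (replicated on every GPU). The toy NumPy sketch below only illustrates that idea; it is not HugeCTR code, and all names, sizes, and the hash-based placement are assumptions made up for the example.

```python
# Toy sketch of "model-parallel embedding + data-parallel dense network".
# Not HugeCTR code: devices are simulated as Python lists, and the
# key -> shard mapping is an arbitrary modulo hash chosen for illustration.
import numpy as np

NUM_GPUS = 4        # hypothetical homogeneous cluster
VOCAB_SIZE = 1000   # total embedding rows (illustrative)
EMB_DIM = 16

# Model parallel: each "GPU" owns only its shard of the embedding table.
shards = [np.random.randn(VOCAB_SIZE // NUM_GPUS, EMB_DIM) for _ in range(NUM_GPUS)]

def lookup(key):
    """Route a key to the shard that owns it and return the embedding row."""
    gpu = key % NUM_GPUS        # placement by simple hash across GPUs
    row = key // NUM_GPUS
    return shards[gpu][row]

# Data parallel: every "GPU" keeps a full copy of the dense weights and
# processes only its own slice of the batch.
dense_weights = np.random.randn(EMB_DIM, 1)
dense_replicas = [dense_weights.copy() for _ in range(NUM_GPUS)]

batch = np.random.randint(0, VOCAB_SIZE, size=8)
for gpu, keys in enumerate(np.array_split(batch, NUM_GPUS)):
    emb = np.stack([lookup(int(k)) for k in keys])  # gathers may cross GPUs
    out = emb @ dense_replicas[gpu]                 # purely local dense pass
    print(f"GPU {gpu}: output shape {out.shape}")
```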
