
How to implement distributed deep learning on a small master-slave architecture through a data parallelism approach? #7

Open
Nafees-060 opened this issue Mar 11, 2021 · 0 comments

Comments

@Nafees-060

I am a beginner and I would like to deploy a distributed deep learning model on top of Hadoop as a toy example. I want to use three personal computers (PCs): one would work as a parameter server and the other two would work as worker machines. Initially, I want to configure Hadoop across the three machines (I do not know exactly how this is done). Then I want to distribute the data in pieces to the two worker machines via the parameter server machine for training. Suppose I have 10 GB of data: 5 GB would be shifted to the first worker PC and the other 5 GB would be allocated to the second PC. Then I would like to train the model synchronously with data parallelism on this dataset. What would be the steps to implement a distributed deep learning system on this small network of machines?
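
To make the goal concrete, here is a minimal sketch of the synchronous data-parallel training part using PyTorch's `DistributedDataParallel` (this is only one possible illustration, not the Hadoop setup itself; the master IP, port, toy dataset, and model are placeholder assumptions, and note that DDP replaces the parameter-server role with peer-to-peer gradient all-reduce, with rank 0 acting as the coordinator):

```python
# Minimal sketch of synchronous data parallelism across two worker PCs.
# Assumptions: PyTorch installed on both workers, the rank-0 worker is
# reachable at the placeholder IP below, CPU training over the "gloo" backend.
import os
import sys
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

def train(rank, world_size):
    # Every process must agree on the coordinator's (rank 0's) address.
    os.environ["MASTER_ADDR"] = "192.168.1.10"   # placeholder: rank-0 worker's IP
    os.environ["MASTER_PORT"] = "29500"          # placeholder port
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    # Toy dataset standing in for each machine's data shard;
    # DistributedSampler gives each rank a disjoint half of the samples,
    # mirroring the 5 GB / 5 GB split described above.
    dataset = TensorDataset(torch.randn(1000, 20), torch.randint(0, 2, (1000,)))
    sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank)
    loader = DataLoader(dataset, batch_size=32, sampler=sampler)

    model = DDP(nn.Linear(20, 2))  # gradients are all-reduced synchronously
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(5):
        sampler.set_epoch(epoch)   # reshuffle each rank's shard every epoch
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()  # backward() syncs gradients across ranks
            opt.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    # e.g. `python train.py 0` on the first worker, `python train.py 1` on the second
    train(rank=int(sys.argv[1]), world_size=2)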
