Add Sparsified SGD #75
base: develop
Conversation
optimizer.step()

if options.model_name == 'logistic_regression' and options.train_validate:
    t = options.runtime['current_epoch'] * options.train_num_samples_per_device + batch_idx * options.batch_size
Single-letter variable names are usually not great for readability (except for something like i as a counter in a for loop); more descriptive variable names make the code easier to read.
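As a hedged example (the name samples_processed is just an illustration, not a name from the PR), the line above could read:

    # a descriptive name makes it clear this counts samples seen so far on this worker
    samples_processed = (
        options.runtime['current_epoch'] * options.train_num_samples_per_device
        + batch_idx * options.batch_size
    )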
I agree, I'll fix it.
for weight in estimated_weights:
    w = weight.squeeze()
    batch_loss = np.log(1 + np.exp(-target * (data @ w)))
This is a soft-margin loss, right? You could use https://pytorch.org/docs/stable/nn.html#softmarginloss and https://pytorch.org/docs/stable/torch.html#torch.matmul here. Especially since numpy ops run on the CPU, not the GPU, the .cuda() call earlier would just waste time.
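A minimal sketch of that suggestion, assuming data, target, and estimated_weights are already torch tensors with the shapes used in the hunk above:

    import torch

    criterion = torch.nn.SoftMarginLoss(reduction='none')  # per-sample log(1 + exp(-y * x))

    for weight in estimated_weights:
        w = weight.squeeze()
        logits = torch.matmul(data, w)          # stays on the GPU if data and w are CUDA tensors
        batch_loss = criterion(logits, target)  # same value as the numpy expression above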
Yes, I'll use the softmargin loss instead, thanks.
train_loss = global_average(loss, num_samples).item()

l2_loss = sum(weight.norm(2) ** 2 for weight in estimated_weights).item()
Same as above: you could use https://pytorch.org/docs/stable/nn.html#l1loss and https://pytorch.org/docs/stable/nn.html#torch.nn.MSELoss here.
As we just need to calculate the L1 and L2 norms of a tensor here, I think using loss functions would make it more complicated.
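For context, a small sketch (with a stand-in tensor) comparing the two options being discussed here:

    import torch

    weight = torch.randn(10, 5)          # stand-in tensor, just for illustration
    zeros = torch.zeros_like(weight)

    # Loss-module route suggested above: compare against an all-zero target.
    l2_via_loss = torch.nn.MSELoss(reduction='sum')(weight, zeros)  # == squared L2 norm
    l1_via_loss = torch.nn.L1Loss(reduction='sum')(weight, zeros)   # == L1 norm

    # Direct norm route used in the PR: no extra zero tensor needed.
    l2_direct = weight.norm(2) ** 2
    l1_direct = weight.norm(1)

    assert torch.allclose(l2_via_loss, l2_direct)
    assert torch.allclose(l1_via_loss, l1_direct)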
self.num_coordinates = sparse_grad_size

def __setstate__(self, state):
The superclass method is called automatically if the child class doesn't define it, so wrapping it like this is unnecessary.
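A minimal illustration of that Python behaviour, with generic names unrelated to the PR:

    class Base:
        def __setstate__(self, state):
            self.__dict__.update(state)

    class Child(Base):
        # No __setstate__ here: attribute lookup falls back to Base.__setstate__
        # automatically, so a wrapper that only calls super().__setstate__(state)
        # adds nothing.
        pass

    child = Child()
    child.__setstate__({'value': 1})
    assert child.value == 1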
Yes, you're right. I forgot to remove it.
params_sparse_tensors = []

for ind, param in enumerate(model.parameters()):
    self.state[param]['memory'] += param.grad.data * lr[0]
If a layer or similar gets manually assigned a different tensor, the dict key here would change (it's id(parameter), which maps to the Python object identifier). This could be prevented by using model.named_parameters(), which returns a unique name for each parameter along with its value; that name could then be used as the dict key.
I cannot use the name as the dict key, because the key here is not the name of the parameter; it's the parameter itself.
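For context, a rough sketch of the pattern being discussed (this mirrors how torch.optim keeps per-parameter state; the memory buffer name follows the hunk above, the rest is assumed):

    import torch

    model = torch.nn.Linear(4, 1)
    lr = 0.1
    state = {}  # keyed by the parameter tensor itself, as in torch.optim.Optimizer.state

    for param in model.parameters():
        # one error-feedback memory buffer per parameter
        state[param] = {'memory': torch.zeros_like(param.data)}

    # after a backward pass, accumulate the scaled gradient into the memory
    model(torch.randn(8, 4)).sum().backward()
    for param in model.parameters():
        state[param]['memory'] += param.grad.data * lr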
This adds an implementation of sparsified SGD (the sparse variant of SGD). An optimizer class, a scheduler function, and a communication function are added here.
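For readers unfamiliar with the technique, here is a rough, simplified sketch of one sparsified-SGD update with error-feedback memory (a random-k variant; an illustration under stated assumptions, not the PR's actual optimizer, though num_coordinates and the memory buffer follow the hunks above):

    import torch

    def sparsified_sgd_step(param, memory, lr, num_coordinates):
        # Accumulate the scaled gradient into the residual memory (error feedback).
        memory += lr * param.grad.data

        # Select a random subset of coordinates to apply (and, in the distributed
        # setting, to communicate) this step.
        flat = memory.view(-1)
        idx = torch.randperm(flat.numel())[:num_coordinates]

        # Build the sparse update and subtract the transmitted part from the memory.
        sparse_update = torch.zeros_like(flat)
        sparse_update[idx] = flat[idx]
        flat[idx] = 0.0

        # Apply the sparse update to the parameter.
        param.data -= sparse_update.view_as(param.data)
        return sparse_update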