WIP: implementation of non-contiguous group lasso penalty GroupL1nc #94

matteofrigo · 2021-03-29T10:04:59Z

This PR adds a new class to the copt.penalty module which defines a non-contiguous version of the existing GroupL1 penalty. The main differences between the two are:

Parsing of the input. GroupL1nc requires just non-overlapping groups.
Definition of B matrix adapted to non-contiguous setting (it's no longer block-diagonal).
Evaluation of _prox_gl adapted to non-contiguous setting.
Result of simple test adapted to new block matrix.

Minor style/PEP8 improvements have been applied to the modified files.

I've still got a couple of concerns on which I would like to have your opinion @fabianp .

The B matrix in prox_factory that I implemented in GroupL1nc is slightly different with respect to the one of GroupL1. The difference is in how the "forgotten" features are encoded in B. Here's a small example. Let n_features = 5 and groups = [(0, 1), (3, 4)], then the B matrix for the two methods is the following:

b_GroupL1 = [[1.0, 1.0, 0.0, 0.0, 0.0],
                       [0.0, 0.0, -1.0, 0.0, 0.0],
                       [0.0, 0.0, 0.0, 1.0, 1.0]]

b_GroupL1nc = [[1.0, 1.0, 0.0, 0.0, 0.0],
                           [0.0, 0.0, 0.0, 1.0, 1.0],
                           [0.0, 0.0, -1.0, 0.0, 0.0]]

I made a few tests and it seems to me that this does not change the result, but I don't have a proof for it. What happens is that we reorder the groups by putting all the "-1" singletons after the groups. Are you aware of any reason why it shouldn't work?

My second concern is about the group structure. Are we good with accepting only non-overlapping groups or do we want a more general code that accepts overlapping groups or even tree structures?

After solving these two points we can think of merging GroupL1 and GroupL1nc, as the latter trivially generalizes the former.

This PR addresses #20 .

This commit adds a new class to the copt.penalty module which defines a non-contiguous version of the existing GroupL1 penalty. The main differences between the two are: - Parsing of the input. GroupL1nc requires just non-overlapping groups. - Definition of block matrix adapted to non-contiguous setting. - Evaluation of _prox_gl adapted to non-contiguous setting. - Result of simple test adapted to new block matrix. Minor style/PEP8 improvements have been applied to the modified files.

coveralls · 2021-03-29T10:13:42Z

Coverage increased (+0.2%) to 88.959% when pulling be2c4e9 on matteofrigo:groupL1-noncontiguous into e5cb995 on openopt:master.

fabianp · 2021-03-29T18:21:03Z

Thanks for the contribution! This already looks great.

I agree that it would be nice to have GroupL1 and GroupL1nc merged as a single object.

I agree with you that intuitively the order in which the forgotten groups appear should not matter for the penalty. This might change if we allow to have a different per-group regularization, but that's not the case yet.

This last part makes me wonder if we should weight the group regularization by the size of the group, as done for instance in Eq. (1) in https://statweb.stanford.edu/~tibs/ftp/sparse-grlasso.pdf . I think this makes a lot of sense when groups have very different sizes. Do you have an opinion on this @matteofrigo ? Of course this is somewhat orthogonal to your pull request, so feel free to ignore this point or postpone for a later contribution

fabianp · 2021-03-29T18:24:11Z

As for the second concern, I would stop at non-overlapping groups, at least for the pull request. The code for more general objects would be significantly different, and which point I would split it in a different object.

This commit merges the GroupL1 classes for contiguous and non-contiguous groups, adding the possibility to apply group-specific penalties. The commit also adds the necessary tests and improves the existing ones. Minor PEP8 issues are fixed.

matteofrigo · 2021-04-02T12:52:48Z

I added the possibility to define group-specific penalties and some more general tests. Also, the two classes are merged back to a single one (GroupL1).

For the moment I added the possibility to use group-wise weights that are like the ones in the paper you mentioned (sqrt(len(g))), it's inverse, and custom weights passed by the user. By default it uses a weight equal to 1 for each group. Anything else that you think it should go there @fabianp ?

matteofrigo · 2021-04-02T14:55:11Z

It seems like my changes to how the B matrix is built broke everything. I'll look into it. For the dense matrices case it works, anyway.

This commit adds the OGroupL1 class for the overlapping groups case. The class is independent w.r.t. GroupL1 as the computation of the proximal operator in the sparse case is quite delicate to deal with. The corresponding tests have been adapted and extended where possible.

matteofrigo · 2021-04-03T14:59:40Z

The problem was in a test where the group sparsity was used with overlapping groups. I added a separate class for that case to keep the two worlds separate as @fabianp suggested in #94 (comment) .

As for the second concern, I would stop at non-overlapping groups, at least for the pull request. The code for more general objects would be significantly different, and which point I would split it in a different object.

To summarise, here's what we did with this PR until now:

Separated overlapping- (OGroupL1) and non-overlapping-groups (GroupL1) cases;
Generalised GroupL1 to non-contiguous groups;
Improved tests for GroupL1;
Adapted tests for overlapping groups;
Fixed minor PEP8 issues.

Anything to add?

fabianp

Thanks for the contribution @matteofrigo ! I'm happy to merge this without the overlapping group lasso part.

I think that the overlapping GL part still requires some discussion that is better to have in a different PR. If you remove that part (leaving it for example for a separa PR) I'm happy to merge this one. Thanks!

fabianp · 2021-04-05T22:56:33Z

copt/penalty.py

@@ -81,7 +195,6 @@ def __call__(self, x):
    def prox(self, x, step_size):


I don't think this is the correct prox for the overlapping group lasso penalty. The prox of this penalty doesn't have a closed form expression but instead requires an iterative scheme (see for example https://papers.nips.cc/paper/2011/file/03c6b06952c750899bb03d998e631860-Paper.pdf or https://www.researchgate.net/profile/Guillaume-Obozinski/publication/221346424_Group_Lasso_with_overlap_and_graph_Lasso/links/0a85e53b2c72e3d517000000/Group-Lasso-with-overlap-and-graph-Lasso.pdf)

I removed the OGroupL1 class. Thanks for the references!

The current implementation of the overlapping group lass is incorrect. This commit removes it from the codebase. The corresponding tests are also removed, with the exception of a test for the VRTOS solver involving overlapping group lasso penalties, which is now commented. Minor PEP8 issues are solved.

fabianp · 2021-04-07T13:13:05Z

tests/test_randomized.py

@@ -188,37 +188,37 @@ def test_gl(groups):
            assert np.linalg.norm(grad_map) < 1e-6


-def test_vrtos_ogl():


why removing the test? This is testing an overlapping group lasso model by adding to proximal functions (for solvers that allow this)

fabianp · 2021-04-07T13:15:37Z

Thanks Matteo. We're almost there, but there's a test that has been commented out while I think its important to keep it, as it's testing a valid implementation of the overlapping group lasso

matteofrigo · 2021-04-23T08:48:44Z

Sorry @fabianp , I've been offline for some days.

I removed the test because I have misunderstood the suggestion to avoid the overlapping groups case.

Speaking about that test, I'm not able to make it pass. I haven't been able to track the bug to its root, but it seems to me that the new definition of the block matrix (in the penalty) changes the definition of the support matrix in a way that makes it wrong. Notice that the only actual difference with the old implementation of the penalty is the order in which we treat the groups. Do you have any low-level way to debug and test the prox_factory function? I haven't found anything except from testing the correctness of the block matrix.

fabianp requested changes Apr 5, 2021

View reviewed changes

fabianp reviewed Apr 7, 2021

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: implementation of non-contiguous group lasso penalty GroupL1nc #94

WIP: implementation of non-contiguous group lasso penalty GroupL1nc #94

matteofrigo commented Mar 29, 2021 •

edited

Loading

coveralls commented Mar 29, 2021 •

edited

Loading

fabianp commented Mar 29, 2021

fabianp commented Mar 29, 2021

matteofrigo commented Apr 2, 2021

matteofrigo commented Apr 2, 2021 •

edited

Loading

matteofrigo commented Apr 3, 2021

fabianp left a comment

fabianp Apr 5, 2021

matteofrigo Apr 6, 2021

fabianp Apr 7, 2021

fabianp commented Apr 7, 2021

matteofrigo commented Apr 23, 2021

		@@ -81,7 +195,6 @@ def __call__(self, x):
		def prox(self, x, step_size):

		@@ -188,37 +188,37 @@ def test_gl(groups):
		assert np.linalg.norm(grad_map) < 1e-6


		def test_vrtos_ogl():

WIP: implementation of non-contiguous group lasso penalty GroupL1nc #94

Are you sure you want to change the base?

WIP: implementation of non-contiguous group lasso penalty GroupL1nc #94

Conversation

matteofrigo commented Mar 29, 2021 • edited Loading

coveralls commented Mar 29, 2021 • edited Loading

fabianp commented Mar 29, 2021

fabianp commented Mar 29, 2021

matteofrigo commented Apr 2, 2021

matteofrigo commented Apr 2, 2021 • edited Loading

matteofrigo commented Apr 3, 2021

fabianp left a comment

Choose a reason for hiding this comment

fabianp Apr 5, 2021

Choose a reason for hiding this comment

matteofrigo Apr 6, 2021

Choose a reason for hiding this comment

fabianp Apr 7, 2021

Choose a reason for hiding this comment

fabianp commented Apr 7, 2021

matteofrigo commented Apr 23, 2021

matteofrigo commented Mar 29, 2021 •

edited

Loading

coveralls commented Mar 29, 2021 •

edited

Loading

matteofrigo commented Apr 2, 2021 •

edited

Loading