Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

diag, eye, and trace #169

Closed
ivan-pi opened this issue Apr 7, 2020 · 22 comments
Closed

diag, eye, and trace #169

ivan-pi opened this issue Apr 7, 2020 · 22 comments
Labels
implementation Implementation in experimental and submission of a PR specification Discussion and iteration over the API

Comments

@ivan-pi
Copy link
Member

ivan-pi commented Apr 7, 2020

Related to #10

Convenience functions for use in linear algebra. The functions would go into stdlib_experimental_linalg.f90. Simple implementation by looping over the diagonal. Type-generic (integer, real, complex) implementation using the fypp preprocessor.

A draft is available here: https://github.com/ivan-pi/stdlib/blob/ivan-pi/src/stdlib_experimental_linalg.fypp (based on the version from fortran-utils #103 )

eye

Returns the identity matrix

function eye(n) result(res)
integer, intent(in) :: n
real(dp), allocatable :: res(:  ,:)
end function
  • Do we need an optional kind dummy variable for the return type (eye(n [, kind]))? Or just default to double precision?
  • Other names: identity, ... ?
  • Generalize also to higher rank arrays (tensors)?
  • Fixed-size or allocatable output?

trace

Returns the sum of diagonal elements

pure function trace(A) result(res)
real(dp), intent(in) :: A(:,:)
real(dp) :: res
end function
  • How to deal with non-square matrices? Take the sum over the shorter dimension? Run-time shape assertion?
  • Optional axis argument?
  • Diagonal offset?
  • Generalize to higher rank arrays (tensors)?

diag

Create a diagonal matrix from a vector or extract a diagonal from a matrix

! vector to matrix
function diag(v, k) result(res)
real(dp), intent(in) :: v(:)
integer, intent(in), optional :: k ! place elements of v on the k-th diagonal
real(dp), allocatable :: res(:,:)
end function
! matrix to vector
function diag(A, k) result(res)
real(dp), intent(in) :: A(:,:)
integer, intent(in), optional :: k ! extract elements from the k-th diagonal of A
real(dp), allocatable :: res(:)
end function
  • Other names: diagonal, ...
  • Specify which diagonal to use?
  • How to deal with non-square matrices? Extract the diagonal along the shorter dimension? Run-time shape assertion?
  • Generalize to higher rank arrays?
  • Fixed-size or allocatable output?

Other languages

The API's differ slightly between languages, some are more flexible than others.

@certik
Copy link
Member

certik commented Apr 7, 2020

Thank you!

I like this.

I have one comment: I think the function should not return an allocatable --- I think it could be faster to simply return the array and expect the caller to provide a correctly pre-allocated array as a result.

@ivan-pi
Copy link
Member Author

ivan-pi commented Apr 7, 2020

Yes, your comment is related to the discussion in #114. If the LHS is also allocatable (and unallocated), then I think the compilers will do the right thing and not create a temporary array.

Would there be any interest in having two versions, say diag and diag_ (or diagA), one expecting preallocated arrays of correct size, and the other one returning allocatable ones? This way it would be possible to select for speed or flexibility.

@certik
Copy link
Member

certik commented Apr 8, 2020

Thanks for finding the prior discussion. What is the advantage of having it allocatable result? With auto allocate LHS, wouldn't the compiler properly allocate the LHS anyway?

@jvdp1
Copy link
Member

jvdp1 commented Apr 8, 2020

With auto allocate LHS, wouldn't the compiler properly allocate the LHS anyway?

Yes, it does.
And if the compiler doesn't do it (because it is not supported yet), I usually do something like:

...
real, allocatable :: x(:, :)
allocate(x, source = diag(x))

@jvdp1
Copy link
Member

jvdp1 commented Apr 8, 2020

Thank you for this.

eye

  • Do we need an optional kind dummy variable for the return type (eye(n [, kind]))? Or just default to double precision?

I would suggest to support all types, as for the other functions. Would a subroutine not easier to implement and to use for this scenario? E.g.,

call eye(x)
  • Other names: identity, ... ?

eye is good for me.

  • Generalize also to higher rank arrays (tensors)?

I would say yes, while I will probably never use it myself.

diag

  • Other names: diagonal, ...

diag sounds good to me.

  • Specify which diagonal to use?

Yes, as you proposed should be fine. I guess that a negative number could be provided to extract elements below the diagonal?

  • How to deal with non-square matrices? Extract the diagonal along the shorter dimension? Run-time shape assertion?

I am not sure to understand the question. I think you could loop until min(size(x, 1), size(x,2)).

  • Generalize to higher rank arrays?

I would answer yes, while I would never use it myself.

@ivan-pi
Copy link
Member Author

ivan-pi commented Apr 8, 2020

eye

  • Do we need an optional kind dummy variable for the return type (eye(n [, kind]))? Or just default to double precision?

I would suggest to support all types, as for the other functions. Would a subroutine not easier to implement and to use for this scenario? E.g.,

call eye(x)

The downside of a subroutine is then you cannot chain the identity matrix into expressions, e.g.

real :: M(9,9), J(9,9)
M = eye(9) + 1./(tau + 0.5)*(eye(9) - J))

On the other hand as a subroutine it would encourage reuse

real :: M(9,9), J(9,9), identity(9,9)
call eye(identity)
M = identity + 1./(tau + 0.5)*(identity - J))

Then again, I see this only as a convenience function, and not to be used inside some expensive kernel, as it is clear there are more efficient ways to add values to the diagonal of a matrix.

@jvdp1 Do you have any suggestion how to control the return type of eye? I don't think I can add an integer kind variable as the return type needs to be resolved at compile time. As a subroutine however it would work fine.


  • Specify which diagonal to use?

Yes, as you proposed should be fine. I guess that a negative number could be provided to extract elements below the diagonal?

From the numpy API for diag:
Diagonal in question. The default is 0. Use k>0 for diagonals above the main diagonal, and k<0 for diagonals below the main diagonal.

I noted now the Julia ˙Diagonal` is a bit different. Given a matrix A, it returns a matrix keeping only the elements on the diagonal (and setting the other values to zero)


  • How to deal with non-square matrices? Extract the diagonal along the shorter dimension? Run-time shape assertion?

I am not sure to understand the question. I think you could loop until min(size(x, 1), size(x,2)).

Your suggestion is equivalent to what I have now:

      function diag_rsp_mat(A) result(res)
        real(sp), intent(in) :: A(:,:)
        real(sp) :: res(minval(shape(A)))

        integer :: i
        do i = 1, minval(shape(A))
          res(i) = A(i,i)
        end do
      end function diag_rsp_mat

Until some form of assertion mechanism is available, I think this is the easiest solution. For the k dummy variable to select a specific diagonal it will be necessary to flag an error if abs(k) > minval(shape(A)) - 1.

@jvdp1
Copy link
Member

jvdp1 commented Apr 8, 2020

@jvdp1 Do you have any suggestion how to control the return type of eye? I don't think I can add an integer kind variable as the return type needs to be resolved at compile time. As a subroutine however it would work fine.

Would a stragegy like the one used for ieee_value be an option for eye?
E.g.,

real(sp) :: a
real(sp), allocatable :: b(:,:)

b= eye(a,5)

b= eye(1._sp, 5)

@milancurcic
Copy link
Member

Thanks for this proposal. I like it. Some comments:

  • API and function names look reasonable to me.
  • Regarding whether the result should be allocatable, only if it must, i.e. if it can't be determined from shape/size of input parameters. I don't see why the function results here need to be allocatable. However, it may need to be if we are to generalize to arbitrary ranks.
  • On functions vs. subroutines: My personal rule is always use functions unless a you need to return more than one result, or you need to make side effects.
  • I am also not the target user of these functions.

Specifically regarding eye:

eye

Do we need an optional kind dummy variable for the return type (eye(n [, kind]))? Or just default to > double precision?

Why different return types? It's just a bunch of ones and zeros of some desired shape. In the same vain, why even double precision result? Why not just integer? You can then use it directly in expressions and let the compiler promote it to the higher types or kinds if needed.

Integer-only result for eye() only becomes an issue if you want to pass it as an argument to a procedure that expects some other type or kind. Then you promote it explicitly.

@ivan-pi
Copy link
Member Author

ivan-pi commented Apr 8, 2020

Thank you both for your comments.

@jvdp1 Yes, that would be the option giving full control to the caller.

@milancurcic I like your solution with implicit type promotion as it simplifies the code.

If most of you agree with the implicit type promotion solution, then eye will simply be:

    integer function eye(n) result(res)
        integer, intent(in) :: n
        integer :: res(n,n)
        integer :: i
        res = 0
        do i = 1, n
            res(i,i) = 1
        end do
    end function

and can be used as to fill both fixed-size or allocatable arrays, and also appear in expressions:

    real(sp) :: a(0:5,0:5)
    real(dp), allocatable :: b(:,:)
    integer(int32), allocatable :: t(:,:)

    a = eye(6)
    b = eye(9)
    t = 3*eye(12)

    call random_number(a)
    a = a + eye(6)

This works fine with the ifort compiler and -warn all. With gfortran-9 and -Wall I get a bunch of warnings of the type test_eye.f90:30:0: Warning: ‘b.dim[0].lbound’ is used uninitialized in this function [-Wuninitialized] (it still works correctly). To prevent them it is necessary to follow Jeremie's approach above and with explicit type promotion:

    allocate(b,source=real(eye(9)))

@certik
Copy link
Member

certik commented Apr 8, 2020

What do you mean by "implicit type promotion"? I didn't get that part. I agree it should look like this:

    function eye(n) result(res)
        integer, intent(in) :: n
        integer :: res(n,n)
        integer :: i
        res = 0
        do i = 1, n
            res(i,i) = 1
        end do
    end function

(You had an extra integer before function, which is superfluous.)

@ivan-pi
Copy link
Member Author

ivan-pi commented Apr 8, 2020

Oops, thanks.

What I meant was we have only one copy of eye that returns an integer result, and allow the compiler to promote it to double, complex, int64, whatever, when used in an expression or upon assignment.

@certik
Copy link
Member

certik commented Apr 8, 2020

I see. I think that's fine.

@jvdp1
Copy link
Member

jvdp1 commented Apr 8, 2020

@ivan-pi I am ok with the implicit type promotion, as proposed.
Note: if we go for this solution, why not using 1-byte integer for the result res. This would limit the amount of memory needed by this function eye.

@milancurcic
Copy link
Member

With gfortran-9 and -Wall I get a bunch of warnings of the type test_eye.f90:30:0: Warning: ‘b.dim[0].lbound’ is used uninitialized in this function [-Wuninitialized] (it still works correctly).

I think this is a spurious warning. b is here used only on the LHS of the assignnment. It's fine.

I strongly recommend against putting workarounds in code only to suppress warnings. In this specific case, I don't think that the compiler is warning about anything reasonable, as far as I can tell.

why not using 1-byte integer for the result res. This would limit the amount of memory needed by this function eye.

Is int8 portable at this time? I don't know. If yes, I think we should do it. I can't think of any other potential downsides.

@ivan-pi ivan-pi added the in progress This proposal is being worked on label Apr 10, 2020
@ivan-pi
Copy link
Member Author

ivan-pi commented Apr 10, 2020

I've gone forward and made a draft pull request (#170), implementing most of your suggestions. 👍

@certik
Copy link
Member

certik commented Apr 10, 2020

@ivan-pi thank you for doing this! Looks very nice.

@certik
Copy link
Member

certik commented Apr 10, 2020

We can now use diag to construct a new matrix from the diagonal.

Would it make sense to also be able to assign to a diagonal of an existing matrix?

@jvdp1
Copy link
Member

jvdp1 commented Apr 10, 2020

Would it make sense to also be able to assign to a diagonal of an existing matrix?

I guess so. In Octave, I usually do that as:

D=diag(diag(A))

@ivan-pi
Copy link
Member Author

ivan-pi commented Apr 10, 2020

Would it make sense to also be able to assign to a diagonal of an existing matrix?

Wouldn't a subroutine be more suitable for this? Something along the lines of

subroutine set_diag(A,v,k)
real, intent(inout) :: A(:,:)
real, intent(in) :: v(minval(shape(A)) - abs(k))
integer, intent(in) :: k
end subroutine

@ivan-pi
Copy link
Member Author

ivan-pi commented Apr 10, 2020

I remembered there was a discussion on the topic of setting the diagonal over at j3-fortran/fortran_proposals#14

There was also a discussion for the identity matrix j3-fortran/fortran_proposals#103 with this interesting Fortran one-liner:

integer, parameter :: n = 4  ! dimension of the identity matrix
real(wp), parameter :: I(n,n) = RESHAPE([ (MERGE(1._wp, 0._wp, i/n==MOD(i,n)), i=0, n**2-1) ], [n,n])

@ivan-pi
Copy link
Member Author

ivan-pi commented Apr 10, 2020

We can now use diag to construct a new matrix from the diagonal.

Would it make sense to also be able to assign to a diagonal of an existing matrix?

Another way would be to have a function which returns the diagonal indices, like in numpy:

>>> di = np.diag_indices(4)
>>> di
(array([0, 1, 2, 3]), array([0, 1, 2, 3]))
>>> a = np.arange(16).reshape(4, 4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
>>> a[di] = 100
>>> a
array([[100,   1,   2,   3],
       [  4, 100,   6,   7],
       [  8,   9, 100,  11],
       [ 12,  13,  14, 100]])

This could work nicely, if it was possible to slice into a rank-2 array as a rank-1 array.


Or perhaps a solution using the intrinsic unpack?

real :: A(3,3)
A = 0
A = unpack([1.,2.,3.],eye(3) == 1, A)

Hopefully a smart compiler would access only the values of A where eye(3) == 1 equals .true.

@ivan-pi ivan-pi added implementation Implementation in experimental and submission of a PR specification Discussion and iteration over the API and removed in progress This proposal is being worked on labels May 5, 2020
@ivan-pi
Copy link
Member Author

ivan-pi commented May 5, 2020

Hooray! With pull request #170 we have a new module for linear algebra. I've edited the labels to reflect the new status.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
implementation Implementation in experimental and submission of a PR specification Discussion and iteration over the API
Projects
None yet
Development

No branches or pull requests

4 participants