• Home
  • About
    • Shuai Aris Chen photo

      Shuai Aris Chen

      Random Walks in the Physics Universe

    • Learn More
  • Posts
    • All Posts
    • Tags Archive
    • Posts Archive
  • Projects
  • RSS

Generalized Inverse

07 May 2022

  • Generalized Inverse
    • Definition
    • Construction

Generalized Inverse

In practice, I come across a problem with taking a derivative of an eigenvector. Mathematically, with a matrix \(H(\alpha)\) and its eigenvector \(\mathbf v(\alpha)\),

\[H(\alpha )\mathbf v(\alpha )=E(\alpha)\mathbf v(\alpha), \label{eigeneq}\]

a question comes how to calculate the derivative \(\partial_\alpha\mathbf v(\alpha )\) efficiently. One naive method is to find \(v(\alpha+\delta \alpha )\), such that \(\partial_\alpha \mathbf v(\alpha)=[\mathbf v(\alpha+\delta\alpha)-\mathbf v(\alpha)]/\delta\alpha\) is obtained directly by the definition. Such a scheme generates errors and we have to determine which eigenvector is targeted for \(H(\alpha+\delta\alpha)\). When the \(H(\alpha)\) depends on plenty of parameters \(\alpha_1,\alpha_2,\cdots\), much effort is needed. As a matter of fact, we can reach our purpose by means of generalized inverse. We set out to take a derivative on both sides of Eq.~\((\ref{eigeneq})\)

\[\partial_\alpha H(\alpha)\mathbf v(\alpha)+H(\alpha)\partial_\alpha\mathbf v(\alpha)=\partial_\alpha E(\alpha)\mathbf v(\alpha)+E(\alpha) \partial_\alpha\mathbf v(\alpha),\]

or equivalently,

\[[E(\alpha)-H(\alpha)]\partial_\alpha\mathbf v(\alpha) = [\partial_\alpha H(\alpha)-\partial_\alpha E(\alpha)] \mathbf v(\alpha).\]

The matrix \(E(\alpha)-H(\alpha)\) is singular for \(E(\alpha)\) is just the eigenvalue, such that we fail to obtain \(\partial_\alpha\mathbf v(\alpha)\) by taking an inverse. In the sense of numerics, both \(\partial_\alpha H(\alpha)\) and \(\partial_\alpha E(\alpha)\) are more friendly to calculate.

Further, by observing

\[\partial_\alpha E(\alpha) = \mathbf v^\dagger (\alpha )\partial_\alpha H(\alpha)\mathbf v(\alpha).\]

The remained thing is to ‘Take inverse’ \([E(\alpha)-H(\alpha)]^-\).

\[\partial_\alpha \mathbf v(\alpha) = [E(\alpha)-H(\alpha)]^-[\partial_\alpha H(\alpha)-\mathbf v^\dagger (\alpha )\partial_\alpha H(\alpha)\mathbf v(\alpha)]\mathbf v(\alpha).\]

A way out is to the concept generalized inverse(g-inverse, pseudoinverse) or Moore-Penrose inverse.

Definition

Definition

A matrix \(A^g\in \mathbb R ^{n\times m }\) is a generalized inverse of a matrix \(A\in \mathbb R^{m\times n }\) if \(A A^gA=A\).

The purpose of constructing a generalized inverse of a matrix is to obtain a matrix that can serve as an inverse in some sense of a wider class of matrices than invertible matrices. A generalized inverse exists for an arbitrary matrix, and when a matrix has a regular inverse, this inverse is its unique generalized inverse.

With the generalized inverse, we can find a solution to a linear system \(Ax=y\). Take \(G\) as a generalized inverse of \(A\). Then

\[AGy=AGAx=Ax=y,\]

which yields \(x=Gy\) as a solution.

Construction

Given a matrix \(A\in \mathbf R^{m\times n }\), we can singular-value decompose it as

\[A= U \begin{pmatrix} \Sigma_1 & 0 \\ 0& 0 \end{pmatrix} V\]

then there exist matrices \(X, Y,Z\) such that

\[A^g = V \begin{pmatrix} \Sigma_1^{-1} & X \\ Y & Z \end{pmatrix} U^T\]

Conversely, any choice of \(X,Y\), and \(Z\) for the matrix of this form is a generalized inverse of \(A\). In particular, the pseudoinverse is given by \(X=Y=Z=0\)

\[A^+ = V \begin{pmatrix} \Sigma_1^{-1} & 0 \\ 0 & 0- \end{pmatrix} U^T\]

One can obtain pseudoinverse by a command PseudoInverse in Mathematica software.



Math