A Summary of Matrix Derivatives
Basics

Basic rules

Starting from the scalar differential $df=f'(x)dx$, we define the differential of a scalar function $f$ with respect to a vector $\mathbf{x}$ and with respect to a matrix $\mathbf{X}$:

$$df=\sum_{i=1}^n\frac{\partial f}{\partial x_i}dx_i=\frac{\partial f}{\partial\mathbf{x}}^Td\mathbf{x}=\frac{\partial f}{\partial \mathbf{x}}\cdot d\mathbf{x}$$

$$df=\sum_{i=1}^{m}\sum_{j=1}^{n}\frac{\partial f}{\partial X_{ij}}dX_{ij}=\mathrm{tr}\left(\frac{\partial f}{\partial \mathbf{X}}^{T}d\mathbf{X}\right)=\mathbf{X}\cdot...$$
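
As a quick numerical sanity check of the matrix differential, the NumPy sketch below compares the entrywise sum $\sum_{i,j}\frac{\partial f}{\partial X_{ij}}dX_{ij}$ with the trace form $\mathrm{tr}\left(\frac{\partial f}{\partial \mathbf{X}}^{T}d\mathbf{X}\right)$, and both with the actual change in $f$ under a small perturbation. The test function $f(\mathbf{X})=\sum_{i,j}X_{ij}^2$, its gradient $2\mathbf{X}$, and the names `f`, `grad_f`, `dX` are assumptions chosen for illustration; they do not come from the original text.

```python
import numpy as np

def f(X):
    # Illustrative scalar function (an assumption): sum of squared entries.
    return np.sum(X ** 2)

def grad_f(X):
    # Analytic gradient of the function above: d f / d X_ij = 2 * X_ij.
    return 2 * X

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))           # m = 3, n = 4
dX = 1e-6 * rng.standard_normal((3, 4))   # small perturbation playing the role of dX

# Entrywise double sum: sum_ij (d f / d X_ij) * dX_ij
df_sum = np.sum(grad_f(X) * dX)

# Trace form: tr((d f / d X)^T dX)
df_trace = np.trace(grad_f(X).T @ dX)

# Actual change in f, which matches df only to first order in dX
df_actual = f(X + dX) - f(X)

# For this quadratic f the exact remainder is sum(dX ** 2), i.e. second order.
remainder = abs(df_actual - df_trace)

print("sum form   :", df_sum)
print("trace form :", df_trace)
print("actual     :", df_actual)
print("remainder is second order:", remainder <= 2 * np.sum(dX ** 2))
```

The two forms agree up to floating-point rounding because $\mathrm{tr}(\mathbf{A}^T\mathbf{B})=\sum_{i,j}A_{ij}B_{ij}$, which is what lets the gradient be read directly off a differential written in trace form.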