PyTorch Deep Learning: How to Use the addmm() and addmm_() Functions

2025-11-05 17:13:42

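I. Function explanation

The definition can be found in torch/_C/_VariableFunctions.py. The function implements the following formula:

out = beta × mat + alpha × (mat1 @ mat2)

In other words, it takes five arguments: every element of mat is multiplied by beta, mat1 and mat2 are matrix-multiplied (rows of the left matrix times columns of the right matrix) and the product is multiplied by alpha, and finally the two results are added together. Put like that it may still feel abstract, so after the docstring I walk through a piece of code that should make it clear.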

    def addmm(self, beta=1, mat, alpha=1, mat1, mat2, out=None): # real signature unknown; restored from __doc__
        """
        addmm(beta=1, mat, alpha=1, mat1, mat2, out=None) -> Tensor
        Performs a matrix multiplication of the matrices :attr:`mat1` and :attr:`mat2`.
        The matrix :attr:`mat` is added to the final result.
        If :attr:`mat1` is a :math:`(n \times m)` tensor, :attr:`mat2` is a
        :math:`(m \times p)` tensor, then :attr:`mat` must be
        :ref:`broadcastable <broadcasting-semantics>` with a :math:`(n \times p)` tensor
        and :attr:`out` will be a :math:`(n \times p)` tensor.
        :attr:`alpha` and :attr:`beta` are scaling factors on matrix-vector product between
        :attr:`mat1` and :attr:`mat2` and the added matrix :attr:`mat` respectively.
        .. math::
            out = \beta\ mat + \alpha\ (mat1_i \mathbin{@} mat2_i)
        For inputs of type `FloatTensor` or `DoubleTensor`, arguments :attr:`beta` and
        :attr:`alpha` must be real numbers, otherwise they should be integers.
        Args:
            beta (Number, optional): multiplier for :attr:`mat` (:math:`\beta`)
            mat (Tensor): matrix to be added
            alpha (Number, optional): multiplier for :math:`mat1 @ mat2` (:math:`\alpha`)
            mat1 (Tensor): the first matrix to be multiplied
            mat2 (Tensor): the second matrix to be multiplied
            out (Tensor, optional): the output tensor
        Example::
            >>> M = torch.randn(2, 3)
            >>> mat1 = torch.randn(2, 3)
            >>> mat2 = torch.randn(3, 3)
            >>> torch.addmm(M, mat1, mat2)
            tensor([[-4.8716,  1.4671, -1.3746],
                    [ 0.7573, -3.9555, -2.8681]])
        """
        pass
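
The formula in the docstring can be checked numerically. The following is a minimal sketch, assuming a PyTorch version that accepts `beta` and `alpha` as keyword arguments; the shapes, the seed, and the names `fused`, `manual` and `bias` are illustrative choices of mine, not part of the original article.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only so repeated runs match

beta, alpha = 0.5, 2.0        # illustrative scaling factors
mat = torch.randn(2, 3)       # (n x p)
mat1 = torch.randn(2, 4)      # (n x m)
mat2 = torch.randn(4, 3)      # (m x p)

# Fused call versus the formula written out by hand
fused = torch.addmm(mat, mat1, mat2, beta=beta, alpha=alpha)
manual = beta * mat + alpha * (mat1 @ mat2)
print(torch.allclose(fused, manual, atol=1e-5))

# mat only has to be broadcastable to (n x p), for example a row of shape (p,)
bias = torch.randn(3)
fused_bias = torch.addmm(bias, mat1, mat2)
print(tuple(fused_bias.shape))
```

The second call shows the broadcasting rule from the docstring: a one-dimensional tensor of length p is added to every row of the (n × p) product.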

II. Code example

1. Here is the complete code first. Feel free to copy, paste and run it; I explain each part afterwards.

"""
@author:nickhuang1996
"""
import torch
rectangle_height = 3
rectangle_width = 3
inputs = torch.randn(rectangle_height, rectangle_width)
# Overwrite row i with the constant value i (one assignment per row is enough)
for i in range(rectangle_height):
    inputs[i] = i * torch.ones(rectangle_width)
'''
inputs and its transpose
-->inputs   =   tensor([[0., 0., 0.],
                        [1., 1., 1.],
                        [2., 2., 2.]])
-->inputs_t =   tensor([[0., 1., 2.],
                        [0., 1., 2.],
                        [0., 1., 2.]])
'''
print("inputs:\n", inputs)
inputs_t = inputs.t()
print("inputs_t:\n", inputs_t)
'''
inputs_t @ inputs_t    [[0., 1., 2.],       [[0., 1., 2.],          [[0., 3., 6.]
                    =   [0., 1., 2.],   @    [0., 1., 2.],     =     [0., 3., 6.]
                        [0., 1., 2.]]        [0., 1., 2.]]           [0., 3., 6.]]
'''
'''a, b, c and d = 1 * inputs + 1 * (inputs_t @ inputs_t)'''
a = torch.addmm(input=inputs, mat1=inputs_t, mat2=inputs_t)
b = inputs.addmm(mat1=inputs_t, mat2=inputs_t)
c = torch.addmm(input=inputs, beta=1, mat1=inputs_t, mat2=inputs_t, alpha=1)
d = inputs.addmm(beta=1, mat1=inputs_t, mat2=inputs_t, alpha=1)
'''e and f = 1 * inputs + 1 * (inputs_t @ inputs_t)'''
e = torch.addmm(inputs, inputs_t, inputs_t)
f = inputs.addmm(inputs_t, inputs_t)
'''1 * inputs + 1 * (inputs_t @ inputs_t)'''
g = inputs.addmm(1, inputs_t, inputs_t)
'''2 * inputs + 1 * (inputs_t @ inputs_t)'''
g2 = inputs.addmm(2, inputs_t, inputs_t)
'''h = 1 * inputs + 1 * (inputs_t @ inputs_t)'''
h = inputs.addmm(1, 1, inputs_t, inputs_t)
'''h12 = 1 * inputs + 2 * (inputs_t @ inputs_t)'''
h12 = inputs.addmm(1, 2, inputs_t, inputs_t)
'''h21 = 2 * inputs + 1 * (inputs_t @ inputs_t)'''
h21 = inputs.addmm(2, 1, inputs_t, inputs_t)
print("a:\n", a)
print("b:\n", b)
print("c:\n", c)
print("d:\n", d)
print("e:\n", e)
print("f:\n", f)
print("g:\n", g)
print("g2:\n", g2)
print("h:\n", h)
print("h12:\n", h12)
print("h21:\n", h21)
print("inputs:\n", inputs)
'''inputs = 1 * inputs - 2 * (inputs @ inputs_t)'''
'''
inputs @ inputs_t       [[0., 0., 0.],       [[0., 1., 2.],          [[0., 0., 0.]
                    =    [1., 1., 1.],   @    [0., 1., 2.],     =     [0., 3., 6.]
                         [2., 2., 2.]]        [0., 1., 2.]]           [0., 6., 12.]]
'''
inputs.addmm_(1, -2, inputs, inputs_t)  # In-place
print("inputs:\n", inputs)

2. In this code,

inputs is a 3×3 matrix:

tensor([[0., 0., 0.],
        [1., 1., 1.],
        [2., 2., 2.]])

inputs_t is also a 3×3 matrix, the transpose of inputs:

tensor([[0., 1., 2.],
        [0., 1., 2.],
        [0., 1., 2.]])

inputs_t @ inputs_t is:

'''
inputs_t @ inputs_t    [[0., 1., 2.],       [[0., 1., 2.],          [[0., 3., 6.]
                    =   [0., 1., 2.],   @    [0., 1., 2.],     =     [0., 3., 6.]
                        [0., 1., 2.]]        [0., 1., 2.]]           [0., 3., 6.]]
'''
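
The same two matrices can also be built without a Python loop. This is only a sketch of one alternative construction; the variable `n` and the use of `torch.arange` with `expand` are my own choices, not what the original listing does.

```python
import torch

n = 3  # matrix size used in the article

# Column [0, 1, 2]^T repeated across n columns; clone() gives the result its own memory
inputs = torch.arange(n, dtype=torch.float32).unsqueeze(1).expand(n, n).clone()
inputs_t = inputs.t()
product = inputs_t @ inputs_t

print(inputs)
print(inputs_t)
print(product)
```

Every row of `product` comes out as [0., 3., 6.], matching the hand calculation above.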

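3. In the code, a, b, c and d use the fully spelled-out form, with every argument passed by keyword. They also show that the input argument can instead be written in front of the call, as the tensor whose method is invoked:

torch.addmm(inputs, mat1, mat2) = inputs.addmm(mat1, mat2)

The formula computed is:

1 × inputs + 1 × (inputs_t @ inputs_t)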

'''a, b, c and d = 1 * inputs + 1 * (inputs_t @ inputs_t)'''
a = torch.addmm(input=inputs, mat1=inputs_t, mat2=inputs_t)
b = inputs.addmm(mat1=inputs_t, mat2=inputs_t)
c = torch.addmm(input=inputs, beta=1, mat1=inputs_t, mat2=inputs_t, alpha=1)
d = inputs.addmm(beta=1, mat1=inputs_t, mat2=inputs_t, alpha=1)

a:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
b:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
c:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
d:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])

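4. The next example shows even more clearly that input can move in front of the call. Here beta and alpha are both omitted, so they take their default value of 1:

The formula computed is:

1 × inputs + 1 × (inputs_t @ inputs_t)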

'''e and f = 1 * inputs + 1 * (inputs_t @ inputs_t)'''
e = torch.addmm(inputs, inputs_t, inputs_t)
f = inputs.addmm(inputs_t, inputs_t)

e:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
f:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
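
Rather than comparing the printed tensors by eye, the equivalence of the call forms can be checked with `torch.equal`. A minimal sketch, with the article's tensors written out literally; the name `expected` is introduced here only for illustration.

```python
import torch

inputs = torch.tensor([[0., 0., 0.],
                       [1., 1., 1.],
                       [2., 2., 2.]])
inputs_t = inputs.t()

a = torch.addmm(input=inputs, mat1=inputs_t, mat2=inputs_t)  # keyword form
e = torch.addmm(inputs, inputs_t, inputs_t)                  # positional form
f = inputs.addmm(inputs_t, inputs_t)                         # method form

expected = torch.tensor([[0., 3., 6.],
                         [1., 4., 7.],
                         [2., 5., 8.]])
print(torch.equal(a, e), torch.equal(e, f), torch.equal(f, expected))
```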

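5. Passing one extra leading argument actually sets beta. (Newer PyTorch releases deprecate this positional form; there, pass beta as a keyword argument instead.)

The formulas computed are:

g = 1 × inputs + 1 × (inputs_t @ inputs_t)

g2 = 2 × inputs + 1 × (inputs_t @ inputs_t)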

'''1 * inputs + 1 * (inputs_t @ inputs_t)'''
g = inputs.addmm(1, inputs_t, inputs_t)
'''2 * inputs + 1 * (inputs_t @ inputs_t)'''
g2 = inputs.addmm(2, inputs_t, inputs_t)

g:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
g2:
tensor([[ 0.,  3.,  6.],
        [ 2.,  5.,  8.],
        [ 4.,  7., 10.]])
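
For readers on a recent PyTorch release, g2 can be written with beta as a keyword argument. The sketch below also illustrates a documented special case of `torch.addmm`: when beta is 0, the input tensor is ignored, so NaN values in it are not propagated. The names `nan_input` and `ignored` are mine, not the article's.

```python
import torch

inputs = torch.tensor([[0., 0., 0.],
                       [1., 1., 1.],
                       [2., 2., 2.]])
inputs_t = inputs.t()

# Keyword form of g2: 2 * inputs + 1 * (inputs_t @ inputs_t)
g2 = inputs.addmm(inputs_t, inputs_t, beta=2)
print(g2)

# With beta=0 the input is ignored, so its NaNs do not reach the result
nan_input = torch.full((3, 3), float("nan"))
ignored = torch.addmm(nan_input, inputs_t, inputs_t, beta=0)
print(ignored)
```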

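6. Passing a second leading argument actually sets alpha as well, so the leading arguments are read as (beta, alpha). As with beta, newer PyTorch releases prefer the keyword argument alpha.

The formulas computed are:

h = 1 × inputs + 1 × (inputs_t @ inputs_t)

h12 = 1 × inputs + 2 × (inputs_t @ inputs_t)

h21 = 2 × inputs + 1 × (inputs_t @ inputs_t)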

'''h = 1 * inputs + 1 * (inputs_t @ inputs_t)'''
h = inputs.addmm(1, 1, inputs_t, inputs_t)
'''h12 = 1 * inputs + 2 * (inputs_t @ inputs_t)'''
h12 = inputs.addmm(1, 2, inputs_t, inputs_t)
'''h21 = 2 * inputs + 1 * (inputs_t @ inputs_t)'''
h21 = inputs.addmm(2, 1, inputs_t, inputs_t)

h:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
h12:
tensor([[ 0.,  6., 12.],
        [ 1.,  7., 13.],
        [ 2.,  8., 14.]])
h21:
tensor([[ 0.,  3.,  6.],
        [ 2.,  5.,  8.],
        [ 4.,  7., 10.]])
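
The same three results can be obtained with keyword arguments, which also removes any doubt about which number is beta and which is alpha. A minimal sketch, assuming a PyTorch version that supports the keyword-only `beta` and `alpha` parameters:

```python
import torch

inputs = torch.tensor([[0., 0., 0.],
                       [1., 1., 1.],
                       [2., 2., 2.]])
inputs_t = inputs.t()

h = inputs.addmm(inputs_t, inputs_t, beta=1, alpha=1)    # 1 * inputs + 1 * product
h12 = inputs.addmm(inputs_t, inputs_t, beta=1, alpha=2)  # 1 * inputs + 2 * product
h21 = inputs.addmm(inputs_t, inputs_t, beta=2, alpha=1)  # 2 * inputs + 1 * product

print(h)
print(h12)
print(h21)
```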

7. Of course, none of the steps above modify inputs, which is still:

inputs:
tensor([[0., 0., 0.],
        [1., 1., 1.],
        [2., 2., 2.]])

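8. addmm_() performs the same computation as addmm(). The difference is that addmm_() works in place: it modifies the original object, writing the result back into the tensor it is called on. For example:

inputs now holds the updated values, so there is no need to write some_variable = inputs.addmm_(...), because inputs itself is the updated tensor!

inputs @ inputs_t is: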

'''
inputs @ inputs_t       [[0., 0., 0.],       [[0., 1., 2.],          [[0., 0., 0.]
                    =    [1., 1., 1.],   @    [0., 1., 2.],     =     [0., 3., 6.]
                         [2., 2., 2.]]        [0., 1., 2.]]           [0., 6., 12.]]
'''

The formula computed is:

inputs = 1 × inputs - 2 × (inputs @ inputs_t)

'''inputs = 1 * inputs - 2 * (inputs @ inputs_t)'''
inputs.addmm_(1, -2, inputs, inputs_t)  # In-place

inputs:
tensor([[  0.,   0.,   0.],
        [  1.,  -5., -11.],
        [  2., -10., -22.]])
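
One detail worth keeping in mind: inputs.t() returns a view that shares memory with inputs, so the in-place call above reads from the same storage it writes to. A more cautious variant, which is my own suggestion and not what the original code does, clones the operands first and uses keyword arguments. The names `mat1`, `mat2` and `result` are illustrative.

```python
import torch

inputs = torch.tensor([[0., 0., 0.],
                       [1., 1., 1.],
                       [2., 2., 2.]])
inputs_t = inputs.t()
print(inputs_t.data_ptr() == inputs.data_ptr())  # True: the transpose is a view

# Snapshot both operands so the in-place write cannot alias them
mat1 = inputs.clone()
mat2 = inputs_t.clone()

# inputs = 1 * inputs - 2 * (mat1 @ mat2), written back into inputs
result = inputs.addmm_(mat1, mat2, beta=1, alpha=-2)
print(result.data_ptr() == inputs.data_ptr())    # the returned tensor is inputs itself
print(inputs)
```

Because inputs_t is a view, it reflects the update too: after the call it is the transpose of the new inputs, while the clones keep the old values.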

III. Program output

inputs:
tensor([[0., 0., 0.],
        [1., 1., 1.],
        [2., 2., 2.]])
inputs_t:
tensor([[0., 1., 2.],
        [0., 1., 2.],
        [0., 1., 2.]])
a:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
b:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
c:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
d:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
e:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
f:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
g:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
g2:
tensor([[ 0.,  3.,  6.],
        [ 2.,  5.,  8.],
        [ 4.,  7., 10.]])
h:
tensor([[0., 3., 6.],
        [1., 4., 7.],
        [2., 5., 8.]])
h12:
tensor([[ 0.,  6., 12.],
        [ 1.,  7., 13.],
        [ 2.,  8., 14.]])
h21:
tensor([[ 0.,  3.,  6.],
        [ 2.,  5.,  8.],
        [ 4.,  7., 10.]])
inputs:
tensor([[0., 0., 0.],
        [1., 1., 1.],
        [2., 2., 2.]])
inputs:
tensor([[  0.,   0.,   0.],
        [  1.,  -5., -11.],
        [  2., -10., -22.]])

