TensorFlow中tf.batch_matmul()的用法

2025-01-28 09:39:43

TensorFlow中tf.batch_matmul()用法

如果有两个三阶张量，size分别为

a.shape = [100, 3, 4]
b.shape = [100, 4, 5]
c = tf.batch_matmul(a, b)

则c.shape = [100, 3, 5] //将每一对 3x4 的矩阵与 4x5 的矩阵分别相乘。batch_size不变

100为张量的batch_size。剩下的两个维度为数据的维度。

不过新版的tensorflow已经移除了上面的函数，使用时换为tf.matmul就可以了。与上面注释的方式是同样的。

附：如果是更高维度。例如(a, b, m, n) 与(a, b, n, k)之间做matmul运算。则结果的维度为(a, b, m, k)。

TensorFlow如何实现batch_matmul

我们知道，在tensorflow早期版本中有tf.batch_matmul()函数，可以实现多维tensor和低维tensor的直接相乘，这在使用过程中非常便捷。

但是最新版本的tensorflow现在只有tf.matmul()函数可以使用，不过只能实现同维度的tensor相乘, 下面的几种方法可以实现batch matmul的可能。

例如: tensor A(batch_size,m,n), tensor B(n,k),实现batch matmul 使得A * B。

方法1: 利用tf.matmul()

对tensor B 进行增维和扩展

A = tf.Variable(tf.random_normal(shape=(batch_size, 2, 3)))
B = tf.Variable(tf.random_normal(shape=(3, 5)))
B_exp = tf.tile(tf.expand_dims(B,0),[batch_size, 1, 1]) #先进行增维再扩展
C = tf.matmul(A, B_exp)

方法2: 利用tf.reshape()

对tensor A 进行reshape操作，然后利用tf.matmul()

A = tf.Variable(tf.random_normal(shape=(batch_size, 2, 3)))
B = tf.Variable(tf.random_normal(shape=(3, 5)))
A = tf.reshape(A, [-1, 3])
C = tf.reshape(tf.matmul(A, B), [-1, 2, 5])

方法3: 利用tf.scan()

利用tf.scan() 对tensor按第0维进行展开的特性

A = tf.Variable(tf.random_normal(shape=(batch_size, 2, 3)))
B = tf.Variable(tf.random_normal(shape=(3, 5)))
initializer = tf.Variable(tf.random_normal(shape=(2,5)))
C = tf.scan(lambda a,x: tf.matmul(x, B), A, initializer)

方法4: 利用tf.einsum()

A = tf.Variable(tf.random_normal(shape=(batch_size, 2, 3)))
B = tf.Variable(tf.random_normal(shape=(3, 5)))
C = tf.einsum('ijk,kl->ijl',A,B)

以上为个人经验，希望能给大家一个参考，也希望大家多多支持我们。

将tf.batch_matmul替换成tf.matmul的实现

我就废话不多说了,大家还是直接看代码吧~ import tensorflow as tf h_doc=tf.placeholder(tf.int32,[None,30,512]) h_query=tf.placeholder(tf.int32,[None,10,512]) temp = tf.matmul(h_doc, h_query, adjoint_b = True) # tf.batch_matmul(h_doc, h_query, adj_y=True) print(temp.get_s
关于Tensorflow中的tf.train.batch函数的使用

这两天一直在看tensorflow中的读取数据的队列,说实话,真的是很难懂.也可能我之前没这方面的经验吧,最早我都使用的theano,什么都是自己写.经过这两天的文档以及相关资料,并且请教了国内的师弟.今天算是有点小感受了.简单的说,就是计算图是从一个管道中读取数据的,录入管道是用的现成的方法,读取也是.为了保证多线程的时候从一个管道读取数据不会乱吧,所以这种时候读取的时候需要线程管理的相关操作.今天我实验室了一个简单的操作,就是给一个有序的数据,看看读出来是不是有序的,结果发现是有序的,所以
浅谈keras中的batch_dot,dot方法和TensorFlow的matmul

概述在使用keras中的keras.backend.batch_dot和tf.matmul实现功能其实是一样的智能矩阵乘法,比如A,B,C,D,E,F,G,H,I,J,K,L都是二维矩阵,中间点表示矩阵乘法,AG 表示矩阵A 和G 矩阵乘法(A 的列维度等于G 行维度),WX=Z import keras.backend as K import tensorflow as tf import numpy as np w = K.variable(np.random.randint(10,siz
TensorFlow中tf.batch_matmul()的用法

TensorFlow中tf.batch_matmul()用法如果有两个三阶张量,size分别为 a.shape = [100, 3, 4] b.shape = [100, 4, 5] c = tf.batch_matmul(a, b) 则c.shape = [100, 3, 5] //将每一对 3x4 的矩阵与 4x5 的矩阵分别相乘.batch_size不变 100为张量的batch_size.剩下的两个维度为数据的维度. 不过新版的tensorflow已经移除了上面的函数,使用时换为tf.
Tensorflow中tf.ConfigProto()的用法详解

参考Tensorflow Machine Leanrning Cookbook tf.ConfigProto()主要的作用是配置tf.Session的运算方式,比如gpu运算或者cpu运算具体代码如下: import tensorflow as tf session_config = tf.ConfigProto( log_device_placement=True, inter_op_parallelism_threads=0, intra_op_parallelism_threads=0,
tensorflow中tf.slice和tf.gather切片函数的使用

tf.slice(input_, begin, size, name=None):按照指定的下标范围抽取连续区域的子集 tf.gather(params, indices, validate_indices=None, name=None):按照指定的下标集合从axis=0中抽取子集,适合抽取不连续区域的子集输出: input = [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]], [[5, 5, 5], [6, 6, 6]]] tf.slice(
浅谈tensorflow 中tf.concat()的使用

concat()是将tensor沿着指定维度连接起来.其中tensorflow1.3版中是这样定义的: concat(values,axis,name='concat') 一.对于2维来说,0表示行,1表示列 t1 = [[1, 2, 3], [4, 5, 6]] t2 = [[7, 8, 9], [10, 11, 12]] with tf.Session() as sess: print(sess.run(tf.concat([t1, t2], 0) )) 结果为:[[1, 2, 3], [4
对tensorflow中tf.nn.conv1d和layers.conv1d的区别详解

在用tensorflow做一维的卷积神经网络的时候会遇到tf.nn.conv1d和layers.conv1d这两个函数,但是这两个函数有什么区别呢,通过计算得到一些规律. 1.关于tf.nn.conv1d的解释,以下是Tensor Flow中关于tf.nn.conv1d的API注解: Computes a 1-D convolution given 3-D input and filter tensors. Given an input tensor of shape [batch, in_wi
tensorflow中tf.reduce_mean函数的使用

tf.reduce_mean 函数用于计算张量tensor沿着指定的数轴(tensor的某一维度)上的的平均值,主要用作降维或者计算tensor(图像)的平均值. reduce_mean(input_tensor, axis=None, keep_dims=False, name=None, reduction_indices=None) 第一个参数input_tensor: 输入的待降维的tensor; 第二个参数axis: 指定的轴,如果不指定,则计算所有元素的均值; 第三个参数keep_d
tensorflow中的数据类型dtype用法说明

Tensorflow中,主要有以下几种数据类型(dtype),在旧版本中,不用加tf也能使用. 有符号整型 tf.int8:8位整数. tf.int16:16位整数. tf.int32:32位整数. tf.int64:64位整数. 无符号整型 tf.uint8:8位无符号整数. tf.uint16:16位无符号整数. 浮点型 tf.float16:16位浮点数. tf.float32:32位浮点数. tf.float64:64位浮点数. tf.double:等同于tf.float64. 字符串型
Tensorflow中k.gradients()和tf.stop_gradient()用法说明

上周在实验室开荒某个代码,看到中间这么一段,对Tensorflow中的stop_gradient()还不熟悉,特此周末进行重新并总结. y = xx + K.stop_gradient(rounded - xx) 这代码最终调用位置在tensoflow.python.ops.gen_array_ops.stop_gradient(input, name=None),关于这段代码为什么这样写的意义在文末给出. [stop_gradient()意义] 用stop_gradient生成损失函数w.r.
浅谈tensorflow中几个随机函数的用法

如下所示: tf.constant(value, dtype=None, shape=None) 创建一个常量tensor,按照给出value来赋值,可以用shape来指定其形状.value可以是一个数,也可以是一个list. 如果是一个数,那么这个常亮中所有值的按该数来赋值. tf.random_normal(shape,mean=0.0,stddev=1.0,dtype=tf.float32) tf.truncated_normal(shape, mean=0.0, stddev=1.0,