关于tf.nn.dynamic_rnn返回值详解

函数原型

tf.nn.dynamic_rnn(
  cell,
  inputs,
  sequence_length=None,
  initial_state=None,
  dtype=None,
  parallel_iterations=None,
  swap_memory=False,
  time_major=False,
  scope=None
)

实例讲解:

import tensorflow as tf
import numpy as np

n_steps = 2
n_inputs = 3
n_neurons = 5

X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=n_neurons)

seq_length = tf.placeholder(tf.int32, [None])
outputs, states = tf.nn.dynamic_rnn(basic_cell, X, dtype=tf.float32,
                  sequence_length=seq_length)

init = tf.global_variables_initializer()

X_batch = np.array([
    # step 0   step 1
    [[0, 1, 2], [9, 8, 7]], # instance 1
    [[3, 4, 5], [0, 0, 0]], # instance 2 (padded with zero vectors)
    [[6, 7, 8], [6, 5, 4]], # instance 3
    [[9, 0, 1], [3, 2, 1]], # instance 4
  ])
seq_length_batch = np.array([2, 1, 2, 2])

with tf.Session() as sess:
  init.run()
  outputs_val, states_val = sess.run(
    [outputs, states], feed_dict={X: X_batch, seq_length: seq_length_batch})
  print("outputs_val.shape:", outputs_val.shape, "states_val.shape:", states_val.shape)
  print("outputs_val:", outputs_val, "states_val:", states_val)

log info:

outputs_val.shape: (4, 2, 5) states_val.shape: (4, 5)
outputs_val:
[[[ 0.53073734 -0.61281306 -0.5437517  0.7320347 -0.6109526 ]
 [ 0.99996936 0.99990636 -0.9867181  0.99726075 -0.99999976]]

 [[ 0.9931584  0.5877845 -0.9100412  0.988892  -0.9982337 ]
 [ 0.     0.     0.     0.     0.    ]]

 [[ 0.99992317 0.96815354 -0.985101  0.9995968 -0.9999936 ]
 [ 0.99948144 0.9998127 -0.57493806 0.91015154 -0.99998355]]

 [[ 0.99999255 0.9998929  0.26732785 0.36024097 -0.99991137]
 [ 0.98875254 0.9922327  0.6505734  0.4732064 -0.9957567 ]]]
states_val:
 [[ 0.99996936 0.99990636 -0.9867181  0.99726075 -0.99999976]
 [ 0.9931584  0.5877845 -0.9100412  0.988892  -0.9982337 ]
 [ 0.99948144 0.9998127 -0.57493806 0.91015154 -0.99998355]
 [ 0.98875254 0.9922327  0.6505734  0.4732064 -0.9957567 ]]

首先输入X是一个 [batch_size,step,input_size] = [4,2,3] 的tensor,注意我们这里调用的是BasicRNNCell,只有一层循环网络,outputs是最后一层每个step的输出,它的结构是[batch_size,step,n_neurons] = [4,2,5],states是每一层的最后那个step的输出,由于本例中,我们的循环网络只有一个隐藏层,所以它就代表这一层的最后那个step的输出,因此它和step的大小是没有关系的,我们的X有4个样本组成,输出神经元大小n_neurons是5,因此states的结构就是[batch_size,n_neurons] = [4,5],最后我们观察数据,states的每条数据正好就是outputs的最后一个step的输出。

下面我们继续讲解多个隐藏层的情况,这里是三个隐藏层,注意我们这里仍然是调用BasicRNNCell

import tensorflow as tf
import numpy as np

n_steps = 2
n_inputs = 3
n_neurons = 5
n_layers = 3

X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
seq_length = tf.placeholder(tf.int32, [None])

layers = [tf.contrib.rnn.BasicRNNCell(num_units=n_neurons,
                   activation=tf.nn.relu)
     for layer in range(n_layers)]
multi_layer_cell = tf.contrib.rnn.MultiRNNCell(layers)
outputs, states = tf.nn.dynamic_rnn(multi_layer_cell, X, dtype=tf.float32, sequence_length=seq_length)

init = tf.global_variables_initializer()

X_batch = np.array([
    # step 0   step 1
    [[0, 1, 2], [9, 8, 7]], # instance 1
    [[3, 4, 5], [0, 0, 0]], # instance 2 (padded with zero vectors)
    [[6, 7, 8], [6, 5, 4]], # instance 3
    [[9, 0, 1], [3, 2, 1]], # instance 4
  ])

seq_length_batch = np.array([2, 1, 2, 2])

with tf.Session() as sess:
  init.run()
  outputs_val, states_val = sess.run(
    [outputs, states], feed_dict={X: X_batch, seq_length: seq_length_batch})
  print("outputs_val.shape:", outputs, "states_val.shape:", states)
  print("outputs_val:", outputs_val, "states_val:", states_val)

log info:

outputs_val.shape:
Tensor("rnn/transpose_1:0", shape=(?, 2, 5), dtype=float32) 

states_val.shape:
(<tf.Tensor 'rnn/while/Exit_3:0' shape=(?, 5) dtype=float32>,
 <tf.Tensor 'rnn/while/Exit_4:0' shape=(?, 5) dtype=float32>,
 <tf.Tensor 'rnn/while/Exit_5:0' shape=(?, 5) dtype=float32>)

outputs_val:
 [[[0.     0.     0.     0.     0.    ]
 [0.     0.18740742 0.     0.2997518 0.    ]]

 [[0.     0.07222144 0.     0.11551574 0.    ]
 [0.     0.     0.     0.     0.    ]]

 [[0.     0.13463384 0.     0.21534224 0.    ]
 [0.03702604 0.18443246 0.     0.34539366 0.    ]]

 [[0.     0.54511094 0.     0.8718864 0.    ]
 [0.5382122 0.     0.04396425 0.4040263 0.    ]]] 

states_val:
 (array([[0.    , 0.83723307, 0.    , 0.    , 2.8518028 ],
    [0.    , 0.1996038 , 0.    , 0.    , 1.5456247 ],
    [0.    , 1.1372368 , 0.    , 0.    , 0.832613 ],
    [0.    , 0.7904129 , 2.4675028 , 0.    , 0.36980057]],
   dtype=float32),
 array([[0.6524607 , 0.    , 0.    , 0.    , 0.    ],
    [0.25143963, 0.    , 0.    , 0.    , 0.    ],
    [0.5010576 , 0.    , 0.    , 0.    , 0.    ],
    [0.    , 0.3166597 , 0.4545995 , 0.    , 0.    ]],
   dtype=float32),
 array([[0.    , 0.18740742, 0.    , 0.2997518 , 0.    ],
    [0.    , 0.07222144, 0.    , 0.11551574, 0.    ],
    [0.03702604, 0.18443246, 0.    , 0.34539366, 0.    ],
    [0.5382122 , 0.    , 0.04396425, 0.4040263 , 0.    ]],
   dtype=float32))

我们说过,outputs是最后一层的输出,即 [batch_size,step,n_neurons] = [4,2,5]

states是每一层的最后一个step的输出,即三个结构为 [batch_size,n_neurons] = [4,5] 的tensor

继续观察数据,states中的最后一个array,正好是outputs的最后那个step的输出

下面我们继续讲当由BasicLSTMCell构造单元工厂的时候,只讲多层的情况,我们只需要将上面的BasicRNNCell替换成BasicLSTMCell就行了,打印信息如下:

outputs_val.shape:
Tensor("rnn/transpose_1:0", shape=(?, 2, 5), dtype=float32) 

states_val.shape:
(LSTMStateTuple(c=<tf.Tensor 'rnn/while/Exit_3:0' shape=(?, 5) dtype=float32>,
        h=<tf.Tensor 'rnn/while/Exit_4:0' shape=(?, 5) dtype=float32>),
LSTMStateTuple(c=<tf.Tensor 'rnn/while/Exit_5:0' shape=(?, 5) dtype=float32>,
        h=<tf.Tensor 'rnn/while/Exit_6:0' shape=(?, 5) dtype=float32>),
LSTMStateTuple(c=<tf.Tensor 'rnn/while/Exit_7:0' shape=(?, 5) dtype=float32>,
        h=<tf.Tensor 'rnn/while/Exit_8:0' shape=(?, 5) dtype=float32>))

outputs_val:
[[[1.2949290e-04 0.0000000e+00 2.7623639e-04 0.0000000e+00 0.0000000e+00]
 [9.4675866e-05 0.0000000e+00 2.0214770e-04 0.0000000e+00 0.0000000e+00]]

 [[4.3100454e-06 4.2123037e-07 1.4312843e-06 0.0000000e+00 0.0000000e+00]
 [0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]]

 [[0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]
 [0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]]

 [[0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]
 [0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]]] 

states_val:
(LSTMStateTuple(
c=array([[0.    , 0.    , 0.04676079, 0.04284539, 0.    ],
    [0.    , 0.    , 0.0115245 , 0.    , 0.    ],
    [0.    , 0.    , 0.    , 0.    , 0.    ],
    [0.    , 0.    , 0.    , 0.    , 0.    ]],
   dtype=float32),
h=array([[0.    , 0.    , 0.00035096, 0.04284406, 0.    ],
    [0.    , 0.    , 0.00142574, 0.    , 0.    ],
    [0.    , 0.    , 0.    , 0.    , 0.    ],
    [0.    , 0.    , 0.    , 0.    , 0.    ]],
   dtype=float32)),
LSTMStateTuple(
c=array([[0.0000000e+00, 1.0477135e-02, 4.9871090e-03, 8.2785974e-04,
    0.0000000e+00],
    [0.0000000e+00, 2.3306280e-04, 0.0000000e+00, 9.9445322e-05,
    5.9535629e-05],
    [0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
    0.0000000e+00],
    [0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
    0.0000000e+00]], dtype=float32),
h=array([[0.00000000e+00, 5.23016974e-03, 2.47756205e-03, 4.11730434e-04,
    0.00000000e+00],
    [0.00000000e+00, 1.16522635e-04, 0.00000000e+00, 4.97301044e-05,
    2.97713632e-05],
    [0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
    0.00000000e+00],
    [0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
    0.00000000e+00]], dtype=float32)),
LSTMStateTuple(
c=array([[1.8937115e-04, 0.0000000e+00, 4.0442235e-04, 0.0000000e+00,
    0.0000000e+00],
    [8.6200516e-06, 8.4243663e-07, 2.8625946e-06, 0.0000000e+00,
    0.0000000e+00],
    [0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
    0.0000000e+00],
    [0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
    0.0000000e+00]], dtype=float32),
h=array([[9.4675866e-05, 0.0000000e+00, 2.0214770e-04, 0.0000000e+00,
    0.0000000e+00],
    [4.3100454e-06, 4.2123037e-07, 1.4312843e-06, 0.0000000e+00,
    0.0000000e+00],
    [0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
    0.0000000e+00],
    [0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
    0.0000000e+00]], dtype=float32)))

我们先看看LSTM单元的结构

如果您不查看框内的内容,LSTM单元看起来与常规单元格完全相同,除了它的状态分为两个向量:h(t)和c(t)。你可以将h(t)视为短期状态,将c(t)视为长期状态。

因此我们的states包含三个LSTMStateTuple,每一个表示每一层的最后一个step的输出,这个输出有两个信息,一个是h表示短期记忆信息,一个是c表示长期记忆信息。维度都是[batch_size,n_neurons] = [4,5],states的最后一个LSTMStateTuple中的h就是outputs的最后一个step的输出

以上这篇关于tf.nn.dynamic_rnn返回值详解就是小编分享给大家的全部内容了,希望能给大家一个参考,也希望大家多多支持我们。

(0)

相关推荐

  • 双向RNN:bidirectional_dynamic_rnn()函数的使用详解

    双向RNN:bidirectional_dynamic_rnn()函数的使用详解 先说下为什么要使用到双向RNN,在读一篇文章的时候,上文提到的信息十分的重要,但这些信息是不足以捕捉文章信息的,下文隐含的信息同样会对该时刻的语义产生影响. 举一个不太恰当的例子,某次工作会议上,领导进行"简洁地"总结,他会在第一句告诉你:"下面,为了节约时间,我简单地说两点-",(-此处略去五百字-),"首先,-.",(-此处略去一万字-),"碍于时间的

  • 关于tf.nn.dynamic_rnn返回值详解

    函数原型 tf.nn.dynamic_rnn( cell, inputs, sequence_length=None, initial_state=None, dtype=None, parallel_iterations=None, swap_memory=False, time_major=False, scope=None ) 实例讲解: import tensorflow as tf import numpy as np n_steps = 2 n_inputs = 3 n_neuron

  • 对python3 中方法各种参数和返回值详解

    如下所示: # -*- coding:utf-8 -*- # Author: Evan Mi # 函数 def func1(): print('in the func1') return 0 # 过程 def func2(): print('in the func2') """ 多个值用逗号分割后返回,会分装到一个tuple中返回, 接收的时候,如果使用一个变量接收,那么这个接收变量就是一个tuple类型的 如果接收的时候也用逗号分割多个值来接收,那么可以分别对应返回tupl

  • 前端传参数进行Mybatis调用mysql存储过程执行返回值详解

    目录 查询数据库中的存储过程: 方法一: select `name` from mysql.proc where db = 'your_db_name' and `type`; = 'PROCEDURE' 方法二:  show procedure status; 你要先在数据库中建一个表,然后创建存储过程 我建的表a_tmp,存储过程名称bill_a_forbusiness 执行语句:  CALL bill_a_forbusiness(44,44,52,47,44,46,52,52,349171

  • js弹窗返回值详解(window.open方式)

    test.php 复制代码 代码如下: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-

  • C语言 用指针作为函数返回值详解

    C语言允许函数的返回值是一个指针(地址),我们将这样的函数称为指针函数.下面的例子定义了一个函数 strlong(),用来返回两个字符串中较长的一个: #include <stdio.h> #include <string.h> char *strlong(char *str1, char *str2){ if(strlen(str1) >= strlen(str2)){ return str1; }else{ return str2; } } int main(){ cha

  • C++ 常量成员常量返回值详解

    总结: 1.常量数据成员,形式:const Type m_tData; 1)常量数据成员,需要在构造函数列表中给出,构造函数中可以用常量赋值,也可以实例化的时候赋值. 2)赋值函数中不能赋值,起到保护常量数据成员的作用,和友元作用相反. 2.常量成员函数,形式:type funname(type1 arg1,type2 arg2,...) const 1)常量成员函数,不能修改类数据成员,不能调用非常量函数. 2)常量成员函数的作用,可以有效的将类的函数分为可以修改类的函数,和不能修改类的函数:

  • Java陷阱之慎用入参做返回值详解

    正常情况下,在Java中入参是不建议用做返回值的.除了造成代码不易理解.语义不清等问题外,可能还埋下了陷阱等你入坑. 问题背景 比如有这么一段代码: @Named public class AService { private SupplyAssignment localSupply = new SupplyAssignment(); @Inject private BService bervice; public List<Supply> calcSupplyAssignment() Lis

  • jquery ajax例子返回值详解

    在JQuery中,AJAX有三种实现方式:$.ajax() , $.post , $.get(). 首先我们看$.get(): 复制代码 代码如下: $.get("test.jsp", { name: "cssrain", time: "2008/01/21" }, //要传递的数据 function(data){ alert("返回的数据: " + data); } ) 然后看$.post(): 跟$.get()格式一样.

  • PHP $_FILES中error返回值详解

    $_FILES['file']['error']值 UPLOAD_ERR_OK: 0 //正常,上传成功 UPLOAD_ERR_INI_SIZE: 1 //上传文件大小超过服务器允许上传的最大值,php.ini中设置upload_max_filesize选项限制的值 UPLOAD_ERR_FORM_SIZE: 2 //上传文件大小超过HTML表单中隐藏域MAX_FILE_SIZE选项指定的值 UPLOAD_ERR_NO_TMP_DIR: 6 //没有找不到临时文件夹 UPLOAD_ERR_CAN

  • python循环神经网络RNN函数tf.nn.dynamic_rnn使用

    目录 学习前言 tf.nn.dynamic_rnn的定义 tf.nn.dynamic_rnn的使用举例 单层实验 多层实验 学习前言 已经完成了RNN网络的构建,但是我们对于RNN网络还有许多疑问,特别是tf.nn.dynamic_rnn函数,其具体的应用方式我们并不熟悉,查询了一下资料,我心里的想法是这样的. tf.nn.dynamic_rnn的定义 tf.nn.dynamic_rnn( cell, inputs, sequence_length=None, initial_state=Non

随机推荐