Keras: A Detailed Guide to the Embedding Layer

2025-06-28 11:26:01

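I recently worked on an NLP task and used the Keras Embedding layer for the word embeddings.

This article introduces the Embedding layer in Keras.

Chinese documentation: https://keras.io/zh/layers/embeddings/

Of the layer's parameters, the important ones are input_dim (the vocabulary size) and output_dim (the dimension of each embedding vector), plus the optional input_length (the fixed length of every input sequence).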
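
To make the role of these three parameters concrete, here is a minimal plain-Python sketch of the lookup an embedding layer performs. It is an illustration under assumptions, not Keras code: the names table and embed and the values 1000, 64, and 10 are chosen only for this example.

```python
# Plain-Python sketch of what an embedding lookup computes (no Keras needed).
import random

input_dim, output_dim, input_length = 1000, 64, 10   # illustrative values

# The layer's single weight: a table with one row per word index.
random.seed(0)
table = [[random.uniform(-0.05, 0.05) for _ in range(output_dim)]
         for _ in range(input_dim)]

def embed(batch):
    """Map a (batch, input_length) list of ints to (batch, input_length, output_dim)."""
    return [[table[idx] for idx in sequence] for sequence in batch]

batch = [[random.randrange(input_dim) for _ in range(input_length)]
         for _ in range(32)]
out = embed(batch)
print(len(out), len(out[0]), len(out[0][0]))   # prints: 32 10 64
```

Every index must lie in the range 0 to input_dim - 1, because each one selects a row of the table.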

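How to set the initializer parameters is summarized separately later on.

The demo uses pretrained word vectors (word2vec trained on a Baidu Baike corpus).

A demo of building the embedding matrix: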

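import numpy as np

EMBEDDING_DIM = 300  # assumed value; must match the dimension of the pretrained vectors

def create_embedding(word_index, num_words, word2vec_model):
    # Rows for words missing from the pretrained model stay all-zero.
    embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))
    for word, i in word_index.items():
        if i >= num_words:
            continue  # index falls outside the matrix
        try:
            embedding_matrix[i] = word2vec_model[word]
        except KeyError:
            continue  # word not in the pretrained vocabulary
    return embedding_matrix

# word_index: the vocabulary (a dict mapping each word to its integer index)
# num_words: vocabulary size + 1
# word2vec_model: the pretrained word-vector model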
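
The "+ 1" in num_words is easier to see with a small example. The sketch below assumes word indices start at 1 and are ranked by frequency, which is how the Keras Tokenizer numbers words; build_word_index is a hypothetical helper written only for this illustration.

```python
from collections import Counter

def build_word_index(tokenized_texts):
    """Hypothetical helper: rank words by frequency; the most frequent gets index 1."""
    counts = Counter(token for text in tokenized_texts for token in text)
    # Index 0 is deliberately left unused so it can serve as the padding value.
    return {word: rank
            for rank, (word, _) in enumerate(counts.most_common(), start=1)}

texts = [["keras", "embedding", "layer"], ["keras", "layer"], ["keras"]]
word_index = build_word_index(texts)
num_words = len(word_index) + 1   # +1 for the reserved index 0
print(word_index)   # {'keras': 1, 'layer': 2, 'embedding': 3}
print(num_words)    # 4
```

With three distinct words the largest index is 3, so the embedding matrix needs 4 rows (0 through 3).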

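Loading the word-vector model:

import gensim

def pre_load_embedding_model(model_file):
    # For a model saved with gensim's own save(), use instead:
    #   model = gensim.models.Word2Vec.load(model_file)
    # The call below reads the original word2vec format;
    # pass binary=True when the file is in the binary variant.
    model = gensim.models.KeyedVectors.load_word2vec_format(model_file)
    return model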
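
For orientation, the text variant of the word2vec format is simple: a header line with the vocabulary size and the vector dimension, then one line per word. The parser below is a rough sketch of that layout and not a replacement for gensim; read_word2vec_text and the sample data are made up for the example.

```python
import io

def read_word2vec_text(handle):
    """Parse the plain-text word2vec format into a {word: vector} dict."""
    vocab_size, dim = map(int, handle.readline().split())
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        word, values = parts[0], [float(v) for v in parts[1:]]
        if len(values) != dim:
            raise ValueError("bad vector length for %r" % word)
        vectors[word] = values
    if len(vectors) != vocab_size:
        raise ValueError("header promised %d words, found %d"
                         % (vocab_size, len(vectors)))
    return vectors

# Tiny made-up file: 2 words, 3 dimensions each.
sample = io.StringIO("2 3\nking 0.1 0.2 0.3\nqueen 0.4 0.5 0.6\n")
vectors = read_word2vec_text(sample)
print(vectors["queen"])   # [0.4, 0.5, 0.6]
```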

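Setting up the Embedding layer in the model (pay attention to the arguments, the shape of the Input layer, and the initializer):

from keras.initializers import Constant
from keras.layers import Embedding, Input

embedding_matrix = create_embedding(word_index, num_words, word2vec_model)

# trainable=False keeps the pretrained vectors frozen during training.
embedding_layer = Embedding(num_words,
                            EMBEDDING_DIM,
                            embeddings_initializer=Constant(embedding_matrix),
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False)
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)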
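
Because the Input layer has a fixed shape of (MAX_SEQUENCE_LENGTH,), every sequence of word indices has to be padded or truncated to that length first. The sketch below shows the idea in plain Python. pad_to_length is a hypothetical helper; it pads and truncates at the front, which to my knowledge matches the defaults of Keras's pad_sequences, and the length 5 is only an illustrative value.

```python
def pad_to_length(sequence, max_len, pad_value=0):
    """Hypothetical helper: left-pad or left-truncate a list of word indices."""
    if len(sequence) >= max_len:
        return sequence[-max_len:]            # keep the last max_len tokens
    return [pad_value] * (max_len - len(sequence)) + sequence

MAX_SEQUENCE_LENGTH = 5   # illustrative value
short = pad_to_length([3, 1, 4], MAX_SEQUENCE_LENGTH)
long_ = pad_to_length([3, 1, 4, 1, 5, 9, 2], MAX_SEQUENCE_LENGTH)
print(short)   # [0, 0, 3, 1, 4]
print(long_)   # [4, 1, 5, 9, 2]
```

The padding value 0 is the index that the vocabulary leaves unused, so padded positions map to row 0 of the embedding matrix.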

Initializing the Embedding layer

Two ways to set the initial values of a Keras Embedding

Randomly initialized Embedding

from keras.models import Sequential
from keras.layers import Embedding
import numpy as np

model = Sequential()
model.add(Embedding(1000, 64, input_length=10))
# the model will take as input an integer matrix of size (batch, input_length).
# the largest integer (i.e. word index) in the input should be no larger than 999 (vocabulary size).
# now model.output_shape == (None, 10, 64), where None is the batch dimension.

input_array = np.random.randint(1000, size=(32, 10))

model.compile('rmsprop', 'mse')
output_array = model.predict(input_array)
print(output_array)
assert output_array.shape == (32, 10, 64)

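Specifying the initial embedding values with the weights argument

import numpy as np
import keras

m = keras.models.Sequential()
"""
The initial values can be passed through the weights argument.
The lookup is not differentiable with respect to its integer input indices,
so no gradient flows back past an Embedding layer (the embedding matrix
itself is still trainable). Putting an Embedding in the middle of a network
is therefore pointless; it can only serve as the first layer.
Note that the path from weights to embeddings is indirect, and that weights
must be a list.
"""
embedding = keras.layers.Embedding(
    input_dim=3, output_dim=2, input_length=1,
    weights=[np.arange(3 * 2).reshape((3, 2))], mask_zero=True)
m.add(embedding)  # add() triggers the layer's build() automatically
print(keras.backend.get_value(embedding.embeddings))
m.compile(keras.optimizers.RMSprop(), keras.losses.mse)
# Input shape is (batch, input_length) = (6, 1).
print(m.predict(np.array([[1], [2], [2], [1], [2], [0]])))
print(m.get_layer(index=0).get_weights())
print(keras.backend.get_value(embedding.embeddings))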
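
The docstring's remark about gradients can be made concrete with a small sketch. The plain-Python code below is an illustration only, not Keras's implementation; forward, backward, and the all-ones upstream gradient are hypothetical choices. It shows that the gradient reaches the rows of the lookup table, while the integer indices receive none.

```python
output_dim = 2
# Same values as np.arange(3 * 2).reshape((3, 2)) in the example above.
table = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]

def forward(indices):
    """Embedding lookup: pick one table row per index."""
    return [table[i] for i in indices]

def backward(indices, upstream):
    """Gradient of the loss w.r.t. the table: scatter-add upstream rows by index."""
    grad = [[0.0] * output_dim for _ in table]
    for i, g in zip(indices, upstream):
        for k in range(output_dim):
            grad[i][k] += g[k]
    return grad   # nothing is returned for `indices`: integers have no gradient

indices = [1, 2, 2]
upstream = [[1.0, 1.0] for _ in indices]   # pretend d(loss)/d(output) is all ones
outputs = forward(indices)
table_grad = backward(indices, upstream)
print(outputs)      # [[2.0, 3.0], [4.0, 5.0], [4.0, 5.0]]
print(table_grad)   # [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
```

Row 2 was looked up twice, so it accumulates twice the gradient; row 0 was never used and gets none.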

The second way to set the initial embedding values: using an initializer

import numpy as np
import keras

m = keras.models.Sequential()
"""
Second way to set the embedding values: pass a constant initializer
through embeddings_initializer instead of using weights.
As before, Embedding can only serve as the first layer, because no gradient
flows back through its integer inputs.
"""
embedding = keras.layers.Embedding(
    input_dim=3, output_dim=2, input_length=1,
    embeddings_initializer=keras.initializers.Constant(
        np.arange(3 * 2, dtype=np.float32).reshape((3, 2))))
m.add(embedding)
print(keras.backend.get_value(embedding.embeddings))
m.compile(keras.optimizers.RMSprop(), keras.losses.mse)
# Input shape is (batch, input_length) = (5, 1).
print(m.predict(np.array([[1], [2], [2], [1], [2]])))
print(m.get_layer(index=0).get_weights())
print(keras.backend.get_value(embedding.embeddings))

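The tricky part is working out how weights ends up in the embedding.embeddings tensor.

Embedding is a layer that inherits from Layer. Layer accepts a weights argument, which is a list of NumPy arrays. When the Layer constructor runs, the weights argument is stored in the _initial_weights attribute.

The Layer class in base_layer.py: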

  if 'weights' in kwargs:
   self._initial_weights = kwargs['weights']
  else:
   self._initial_weights = None

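When the Embedding layer is added to a model and connected to the previous layer, layer(previous_layer) is called, where layer is the Embedding instance. Embedding inherits from Layer and does not override __call__(); Layer implements it.

The parent class's __call__() invokes the subclass's call() method to compute the result.

So the method that ultimately runs is Layer.__call__(). It checks whether the layer has already been built, using the boolean self.built.

Layer.__call__ is therefore central to the whole process: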

 def __call__(self, inputs, **kwargs):
  """Wrapper around self.call(), for handling internal references.
  If a Keras tensor is passed:
   - We call self._add_inbound_node().
   - If necessary, we `build` the layer to match
    the _keras_shape of the input(s).
   - We update the _keras_shape of every input tensor with
    its new shape (obtained via self.compute_output_shape).
    This is done as part of _add_inbound_node().
   - We update the _keras_history of the output tensor(s)
    with the current layer.
    This is done as part of _add_inbound_node().
  # Arguments
   inputs: Can be a tensor or list/tuple of tensors.
   **kwargs: Additional keyword arguments to be passed to `call()`.
  # Returns
   Output of the layer's `call` method.
  # Raises
   ValueError: in case the layer is missing shape information
    for its `build` call.
  """
  if isinstance(inputs, list):
   inputs = inputs[:]
  with K.name_scope(self.name):
   # Handle laying building (weight creating, input spec locking).
   if not self.built:  # Not built yet: run build() first, then call()
    # Raise exceptions in case the input is not compatible
    # with the input_spec specified in the layer constructor.
    self.assert_input_compatibility(inputs)

    # Collect input shapes to build layer.
    input_shapes = []
    for x_elem in to_list(inputs):
     if hasattr(x_elem, '_keras_shape'):
      input_shapes.append(x_elem._keras_shape)
     elif hasattr(K, 'int_shape'):
      input_shapes.append(K.int_shape(x_elem))
     else:
      raise ValueError('You tried to call layer "' +
           self.name +
           '". This layer has no information'
           ' about its expected input shape, '
           'and thus cannot be built. '
           'You can build it manually via: '
           '`layer.build(batch_input_shape)`')
    self.build(unpack_singleton(input_shapes))
    self.built = True  # Redundant for Embedding: its build() has already set built to True

    # Load weights that were specified at layer instantiation.
    if self._initial_weights is not None:  # If weights was passed, assign it to the variables; this overwrites the values set in self.build() above.
     self.set_weights(self._initial_weights)

   # Raise exceptions in case the input is not compatible
   # with the input_spec set at build time.
   self.assert_input_compatibility(inputs)

   # Handle mask propagation.
   previous_mask = _collect_previous_mask(inputs)
   user_kwargs = copy.copy(kwargs)
   if not is_all_none(previous_mask):
    # The previous layer generated a mask.
    if has_arg(self.call, 'mask'):
     if 'mask' not in kwargs:
      # If mask is explicitly passed to __call__,
      # we should override the default mask.
      kwargs['mask'] = previous_mask
   # Handle automatic shape inference (only useful for Theano).
   input_shape = _collect_input_shape(inputs)

   # Actually call the layer,
   # collecting output(s), mask(s), and shape(s).
   output = self.call(inputs, **kwargs)
   output_mask = self.compute_mask(inputs, previous_mask)

   # If the layer returns tensors from its inputs, unmodified,
   # we copy them to avoid loss of tensor metadata.
   output_ls = to_list(output)
   inputs_ls = to_list(inputs)
   output_ls_copy = []
   for x in output_ls:
    if x in inputs_ls:
     x = K.identity(x)
    output_ls_copy.append(x)
   output = unpack_singleton(output_ls_copy)

   # Inferring the output shape is only relevant for Theano.
   if all([s is not None
     for s in to_list(input_shape)]):
    output_shape = self.compute_output_shape(input_shape)
   else:
    if isinstance(input_shape, list):
     output_shape = [None for _ in input_shape]
    else:
     output_shape = None

   if (not isinstance(output_mask, (list, tuple)) and
     len(output_ls) > 1):
    # Augment the mask to match the length of the output.
    output_mask = [output_mask] * len(output_ls)

   # Add an inbound node to the layer, so that it keeps track
   # of the call and of all new variables created during the call.
   # This also updates the layer history of the output tensor(s).
   # If the input tensor(s) had not previous Keras history,
   # this does nothing.
   self._add_inbound_node(input_tensors=inputs,
         output_tensors=output,
         input_masks=previous_mask,
         output_masks=output_mask,
         input_shapes=input_shape,
         output_shapes=output_shape,
         arguments=user_kwargs)

   # Apply activity regularizer if any:
   if (hasattr(self, 'activity_regularizer') and
     self.activity_regularizer is not None):
    with K.name_scope('activity_regularizer'):
     regularization_losses = [
      self.activity_regularizer(x)
      for x in to_list(output)]
    self.add_loss(regularization_losses,
        inputs=to_list(inputs))
  return output

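If the layer has not been built yet, Embedding.build() is called automatically. build() pays no attention to weights. If no embeddings_initializer was passed, self.embeddings_initializer falls back to the default random initialization.

If an initializer was passed, the embedding matrix already holds the desired values after this step.

If both embeddings_initializer and weights are passed, weights is applied afterwards and overwrites Embedding.embeddings.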
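
The ordering described above (build() first, then set_weights() if weights was given) can be sketched with a toy class. ToyLayer is a hypothetical stand-in that models only this ordering; it is not the Keras Layer class, and its fixed default initializer merely stands in for random initialization.

```python
class ToyLayer:
    """Hypothetical stand-in for a Keras layer; models only build/weights ordering."""

    def __init__(self, initializer=None, **kwargs):
        # Stand-in for the random default initializer.
        self.initializer = initializer or (lambda: [0.5, 0.5])
        self._initial_weights = kwargs.get("weights")   # remembered, not applied yet
        self.built = False
        self.embeddings = None

    def build(self):
        self.embeddings = self.initializer()   # build() never looks at `weights`
        self.built = True

    def set_weights(self, weights):
        self.embeddings = list(weights[0])     # one list entry per variable

    def __call__(self, inputs):
        if not self.built:
            self.build()
            if self._initial_weights is not None:
                # Applied after build(), so it overrides the initializer's result.
                self.set_weights(self._initial_weights)
        return [self.embeddings[i] for i in inputs]

both = ToyLayer(initializer=lambda: [1.0, 2.0], weights=[[9.0, 8.0]])
only_init = ToyLayer(initializer=lambda: [1.0, 2.0])
print(both([0, 1]))        # [9.0, 8.0]  (weights wins)
print(only_init([0, 1]))   # [1.0, 2.0]  (initializer is used)
```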

The build method of the Embedding class in embeddings.py:

 def build(self, input_shape):
  self.embeddings = self.add_weight(
   shape=(self.input_dim, self.output_dim),
   initializer=self.embeddings_initializer,
   name='embeddings',
   regularizer=self.embeddings_regularizer,
   constraint=self.embeddings_constraint,
   dtype=self.dtype)
  self.built = True

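In summary, passing weights to assign a Layer's variables is a fairly general mechanism in Keras, but it is not very intuitive. Keras encourages using an explicit initializer and leaving weights alone wherever possible.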
