I want to change the loss function in the ptb_word_lm.py example to tf.nn.nce_loss. Looking at the tf.nn.nce_loss implementation:
def nce_loss(weights, biases, inputs, labels, num_sampled, num_classes,
             num_true=1,
             sampled_values=None,
             remove_accidental_hits=False,
             partition_strategy="mod",
             name="nce_loss"):
I don't know what the first two parameters, weights and biases, are. How do I adapt tf.nn.nce_loss to a language model? Thanks.

@Aaron:
Thanks, I tried the following:
loss = tf.reduce_mean(
    tf.nn.nce_loss(softmax_w, softmax_b, logits, tf.reshape(self._targets, [-1,1]),
                   64, vocab_size))
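According to the documentation here:
> biases: A Tensor of shape [num_classes]. The class biases.
> inputs: A Tensor of shape [batch_size, dim]. The forward activations of the input network.
> labels: A Tensor of type int64 and shape [batch_size, num_true]. The target classes.
> num_sampled: An int. The number of classes to randomly sample per batch.
> num_classes: An int. The number of possible classes.

So my PTBModel looks like this: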
class PTBModel(object):
    def __init__(self, is_training, config):
        self.batch_size = batch_size = config.batch_size
        self.num_steps = num_steps = config.num_steps
        size = config.hidden_size
        vocab_size = config.vocab_size
        self._input_data = tf.placeholder(tf.int32, [batch_size, num_steps])
        self._targets = tf.placeholder(tf.int32, [batch_size, num_steps])
        lstm_cell = rnn_cell.BasicLSTMCell(size, forget_bias=0.0)
        if is_training and config.keep_prob < 1:
            lstm_cell = rnn_cell.DropoutWrapper(lstm_cell, output_keep_prob=config.keep_prob)
        cell = rnn_cell.MultiRNNCell([lstm_cell] * config.num_layers)
        self._initial_state = cell.zero_state(batch_size, tf.float32)
        with tf.device("/cpu:0"):
            embedding = tf.get_variable("embedding", [vocab_size, size])
            inputs = tf.nn.embedding_lookup(embedding, self._input_data)
        if is_training and config.keep_prob < 1:
            inputs = tf.nn.dropout(inputs, config.keep_prob)
        outputs = []
        states = []
        state = self._initial_state
        with tf.variable_scope("RNN"):
            for time_step in range(num_steps):
                if time_step > 0: tf.get_variable_scope().reuse_variables()
                (cell_output, state) = cell(inputs[:, time_step, :], state)
                outputs.append(cell_output)
                states.append(state)
        output = tf.reshape(tf.concat(1, outputs), [-1, size])
        softmax_w = tf.get_variable("softmax_w", [size, vocab_size])
        softmax_b = tf.get_variable("softmax_b", [vocab_size])
        logits = tf.matmul(output, softmax_w) + softmax_b
        '''
        #minimize the average negative log probability using sequence_loss_by_example
        loss = seq2seq.sequence_loss_by_example([logits],
                                                [tf.reshape(self._targets, [-1])],
                                                [tf.ones([batch_size * num_steps])],
                                                vocab_size)
        loss = tf.reduce_mean(
            tf.nn.nce_loss(nce_weights, nce_biases, embed, train_labels,
                           num_sampled, vocabulary_size))
        weights: A Tensor of shape [num_classes, dim], or a list of Tensor objects
            whose concatenation along dimension 0 has shape [num_classes, dim]. The (possibly-partitioned) class embeddings.
        biases: A Tensor of shape [num_classes]. The class biases.
        inputs: A Tensor of shape [batch_size, dim]. The forward activations of the input network.
        labels: A Tensor of type int64 and shape [batch_size, num_true]. The target classes.
        num_sampled: An int. The number of classes to randomly sample per batch.
        num_classes: An int. The number of possible classes.
        '''
        loss = tf.reduce_mean(
            tf.nn.nce_loss(softmax_w, softmax_b, logits, tf.reshape(self._targets, [-1,1]),
                           64, vocab_size))
        self._cost = cost = tf.reduce_sum(loss) / batch_size
        self._final_state = states[-1]
        if not is_training:
            return
        self._lr = tf.Variable(0.0, trainable=False)
        tvars = tf.trainable_variables()
        grads, _ = tf.clip_by_global_norm(tf.gradients(cost, tvars),
                                          config.max_grad_norm)
        optimizer = tf.train.GradientDescentOptimizer(self.lr)
        self._train_op = optimizer.apply_gradients(zip(grads, tvars))
However, I get an error:
Epoch: 1 Learning rate: 1.000
W tensorflow/core/common_runtime/executor.cc:1102] 0x528c980 Compute status: Invalid argument: Index 9971 at offset 0 in Tindices is out of range
[[Node: model/nce_loss/embedding_lookup = Gather[Tindices=DT_INT64, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/cpu:0"](model/softmax_w/read, model/nce_loss/concat)]]
W tensorflow/core/common_runtime/executor.cc:1102] 0x528c980 Compute status: Invalid argument: Index 9971 at offset 0 in Tindices is out of range
[[Node: model/nce_loss/embedding_lookup = Gather[Tindices=DT_INT64, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/cpu:0"](model/softmax_w/read, model/nce_loss/concat)]]
[[Node: _send_model/RNN/concat_19_0 = _Send[T=DT_FLOAT, client_terminated=true, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1438650956868917036, tensor_name="model/RNN/concat_19:0", _device="/job:localhost/replica:0/task:0/cpu:0"](model/RNN/concat_19)]]
Traceback (most recent call last):
File "/home/user/works/workspace/python/ptb_word_lm/ptb_word_lm.py", line 235, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/default/_app.py", line 30, in run
sys.exit(main(sys.argv))
File "/home/user/works/workspace/python/ptb_word_lm/ptb_word_lm.py", line 225, in main
verbose=True)
File "/home/user/works/workspace/python/ptb_word_lm/ptb_word_lm.py", line 189, in run_epoch
m.initial_state: state})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 315, in run
return self._run(None, fetches, feed_dict)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 511, in _run
feed_dict_string)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 564, in _do_run
target_list)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 586, in _do_call
e.code)
tensorflow.python.framework.errors.InvalidArgumentError: Index 9971 at offset 0 in Tindices is out of range
[[Node: model/nce_loss/embedding_lookup = Gather[Tindices=DT_INT64, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/cpu:0"](model/softmax_w/read, model/nce_loss/concat)]]
Caused by op u'model/nce_loss/embedding_lookup', defined at:
File "/home/user/works/workspace/python/ptb_word_lm/ptb_word_lm.py", line 235, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/default/_app.py", line 30, in run
sys.exit(main(sys.argv))
File "/home/user/works/workspace/python/ptb_word_lm/ptb_word_lm.py", line 214, in main
m = PTBModel(is_training=True, config=config)
File "/home/user/works/workspace/python/ptb_word_lm/ptb_word_lm.py", line 122, in __init__
64, vocab_size))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn.py", line 798, in nce_loss
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn.py", line 660, in _compute_sampled_logits
weights, all_ids, partition_strategy=partition_strategy)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/embedding_ops.py", line 86, in embedding_lookup
validate_indices=validate_indices)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 447, in gather
validate_indices=validate_indices, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/op_def_library.py", line 655, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2040, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1087, in __init__
self._traceback = _extract_stack()
Am I missing something? Thanks again.

The weights and biases are the weight matrix and bias vector of the language model's output layer.
https://www.tensorflow.org/versions/r0.8/api_docs/python/nn.html#nce_loss
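One possible reading of the error, offered as a sketch rather than a confirmed diagnosis: the traceback shows nce_loss doing an embedding_lookup on softmax_w, that is, picking rows of the weight matrix by class id. The documentation quoted in the question asks for weights of shape [num_classes, dim], but softmax_w is declared as [size, vocab_size], so it has only `size` rows and a class id such as 9971 falls outside that range. If that is what is happening, the call would probably need a weight matrix of shape [vocab_size, size] and the LSTM `output` (shape [batch_size, dim]) as inputs instead of `logits`. The plain-Python snippet below only imitates the shape bookkeeping; `check_nce_shapes` and `gather_rows` are hypothetical helpers, not TensorFlow functions, and the sizes (hidden size 200, vocabulary 10000, 400 rows per batch) are assumptions, since the post does not state the config values.

```python
def check_nce_shapes(weights, biases, inputs, labels, num_classes):
    """Compare argument shapes (tuples of ints) with the documented ones.

    Returns a list of readable mismatches; an empty list means the shapes
    agree with the nce_loss documentation quoted in the question.
    """
    problems = []
    if weights[0] != num_classes:
        problems.append("weights has %d rows, expected num_classes = %d"
                        % (weights[0], num_classes))
    if biases != (num_classes,):
        problems.append("biases has shape %r, expected (%d,)"
                        % (biases, num_classes))
    if inputs[1] != weights[1]:
        problems.append("inputs dim %d does not match weights dim %d"
                        % (inputs[1], weights[1]))
    if labels[0] != inputs[0]:
        problems.append("labels has %d rows, inputs has %d"
                        % (labels[0], inputs[0]))
    return problems


def gather_rows(table, ids):
    """Imitate a row lookup along axis 0 with index validation."""
    for offset, index in enumerate(ids):
        if not 0 <= index < len(table):
            raise IndexError("Index %d at offset %d is out of range"
                             % (index, offset))
    return [table[index] for index in ids]


# Assumed sizes, not taken from the post.
size, vocab_size, rows = 200, 10000, 400

# Shapes as passed in the question: softmax_w and logits.
asked = check_nce_shapes((size, vocab_size), (vocab_size,),
                         (rows, vocab_size), (rows, 1), vocab_size)

# Shapes with a [vocab_size, size] weight matrix and `output` as inputs.
fixed = check_nce_shapes((vocab_size, size), (vocab_size,),
                         (rows, size), (rows, 1), vocab_size)

print(asked)
print(fixed)

# A table with only `size` rows cannot serve class id 9971.
try:
    gather_rows([[0.0]] * size, [9971])
except IndexError as err:
    print(err)
```

Note that the shapes in the question are consistent with each other (inputs and weights share the last dimension), which may be why the graph builds and the failure only shows up at run time, when a class id beyond the number of rows is looked up. The documentation also asks for int64 labels, so casting the targets explicitly would be a reasonable precaution.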