SpanExtractor itself is just a base class.
You need to write a separate function yourself to produce all the span_indices, but that is simple enough (see the sketch below).
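For instance, a minimal sketch of such a helper, assuming inclusive (start, end) pairs and a maximum span width (the name all_span_indices is made up for illustration, it is not part of allennlp):
def all_span_indices(seq_len, max_width):
    # enumerate every inclusive (start, end) pair of width <= max_width
    spans = []
    for start in range(seq_len):
        for end in range(start, min(start + max_width, seq_len)):
            spans.append((start, end))
    return spans
print(all_span_indices(4, 2))
# [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2), (2, 3), (3, 3)]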
Many NLP models deal with representations of spans inside a sentence. SpanExtractors define methods for extracting and representing spans from a sentence.
SpanExtractors take a sequence tensor of shape (batch_size, timesteps, embedding_dim) and indices of shape (batch_size, num_spans, 2) and return a tensor of shape (batch_size, num_spans, …), forming some representation of the spans.
Here timesteps is just the sentence length, and the 2 in the last index dimension is the (start, end) pair of each span.
forward(self, sequence_tensor: torch.FloatTensor, span_indices: torch.LongTensor, sequence_mask: torch.LongTensor = None, span_indices_mask: torch.LongTensor = None)
Given a sequence tensor, extract spans and return representations of them. Span representations can be computed in many different ways, such as concatenation of the start and end embeddings, attention over the vectors contained inside the span, and so on.
Parameters:
sequence_tensor : torch.FloatTensor, required.
A tensor of shape (batch_size, sequence_length, embedding_size) representing an embedded sequence of words, i.e. the representation of the sentence.
span_indices : torch.LongTensor, required.
A tensor of shape (batch_size, num_spans, 2), where the last dimension represents the inclusive start and end indices of the span to be extracted from the sequence_tensor. The 2 holds each span's (start, end) positions in the original sequence.
sequence_mask : torch.LongTensor, optional (default = None).
A tensor of shape (batch_size, sequence_length) representing padded elements of the sequence.
span_indices_mask : torch.LongTensor, optional (default = None).
A tensor of shape (batch_size, num_spans) representing the valid spans in the indices tensor. This mask is optional because sometimes it's easier to worry about masking after calling this function, rather than passing a mask directly.
Returns:
A tensor of shape (batch_size, num_spans, embedded_span_size), where embedded_span_size depends on the way spans are represented.
Two built-in helper methods:
get_input_dim(self) → int
Returns the expected final dimension of the sequence_tensor.
get_output_dim(self) → int
Returns the expected final dimension of the returned span representation.
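For example, a small sketch; the numbers assume an EndpointSpanExtractor with input_dim=10 and the default "x,y" combination:
from allennlp.modules.span_extractors import EndpointSpanExtractor
ex = EndpointSpanExtractor(input_dim=10)  # default combination "x,y"
print(ex.get_input_dim())   # 10
print(ex.get_output_dim())  # 20: start and end embeddings concatenated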
Use this one if you don't want attention weighting:
allennlp.modules.span_extractors.endpoint_span_extractor.EndpointSpanExtractor(input_dim: int, combination: str = 'x, y', num_width_embeddings: int = None, span_width_embedding_dim: int = None, bucket_widths: bool = False, use_exclusive_start_indices: bool = False)
Represents spans as a combination of the embeddings of their endpoints. Additionally, the width of the span can be embedded and concatenated to the final combination.
The following types of representation are supported, assuming x = span_start_embeddings and y = span_end_embeddings:
x, y, x*y, x+y, x-y, x/y, where each of those binary operations is performed elementwise. You can list as many combinations as you like, comma separated. For example, you might give x,y,x*y as the combination parameter to this class. The computed span representation would then be [x; y; x*y], which can then optionally be concatenated with an embedded representation of the span's width.
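As a quick sanity check, this is all the combination string does (plain torch, no allennlp involved):
import torch
x = torch.randn(10)  # stands in for a span_start_embedding
y = torch.randn(10)  # stands in for a span_end_embedding
# "x,y,x*y" means: concatenate x, y, and the elementwise product x*y
rep = torch.cat([x, y, x * y], dim=-1)
print(rep.shape)  # torch.Size([30])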
Parameters:
input_dim : int, required.
The final dimension of the sequence_tensor, i.e. the size of its last dimension, input.shape[-1].
combination : str, optional (default = "x,y").
The method used to combine the start_embedding and end_embedding representations. See above for a full description.
num_width_embeddings : int, optional (default = None).
Specifies the number of buckets to use when representing span width features; effectively the maximum span width you expect.
span_width_embedding_dim : int, optional (default = None).
The embedding size for the span_width features, i.e. how many dimensions to embed the width into.
use_exclusive_start_indices : bool, optional (default = False).
If True, the start indices extracted are converted to exclusive indices. Sentinels are used to represent exclusive span indices for the elements in the first position in the sequence (as the exclusive indices for these elements are outside of the sequence boundary) so that start indices can be exclusive. NOTE: This option can be helpful to avoid the pathological case in which you want span differences for length 1 spans - if you use inclusive indices, you will end up with an x - x operation for length 1 spans, which is not good. In short: set this to True to keep a span's start embedding from coinciding with its end embedding for length-1 spans.
bucket_widths : bool, optional (default = False).
Whether to bucket the span widths into log-space buckets. If False, the raw span widths are used.
import torch
from allennlp.modules.span_extractors import EndpointSpanExtractor
# toy data: batch_size=5, sequence_length=8, embedding_dim=10
seq = torch.randn(5, 8, 10)
# 3 spans per instance, each an inclusive (start, end) pair
indices = torch.LongTensor([[[0, 1], [2, 4], [5, 7]]] * 5)
# instantiating the model corresponds to __init__
es = EndpointSpanExtractor(seq.shape[-1], "x,y,x*y")
# calling the module corresponds to forward
output = es(seq, indices)
print(output.shape)
Output: torch.Size([5, 3, 30])
The last dimension is input_dim times the number of terms in the combination (10 * 3 = 30).
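If you also want width features, here is a sketch under assumed settings (widths bucketed into 8 bins, each embedded into 5 dimensions):
from allennlp.modules.span_extractors import EndpointSpanExtractor
es = EndpointSpanExtractor(
    input_dim=10,
    combination="x,y",
    num_width_embeddings=8,
    span_width_embedding_dim=5,
)
print(es.get_output_dim())  # 25 = 10*2 (endpoints) + 5 (width embedding)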
Use this one if you want attention weighting; it simply adds an attention layer on top:
allennlp.modules.span_extractors.self_attentive_span_extractor.SelfAttentiveSpanExtractor(input_dim: int)
Computes span representations by generating an unnormalized attention score for each word in the document. Span representations are then computed with respect to these scores by normalizing the attention scores for words inside the span.
Given these attention distributions over every span, this module weights the corresponding vector representations of the words in the span by this distribution, returning a weighted representation of each span.
Its forward contract is the same as described for the base class above: given a sequence tensor, extract spans and return representations of them.
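A minimal usage sketch, mirroring the toy shapes above (note that the output's last dimension equals input_dim):
import torch
from allennlp.modules.span_extractors import SelfAttentiveSpanExtractor
seq = torch.randn(5, 8, 10)                                 # (batch, timesteps, dim)
indices = torch.LongTensor([[[0, 1], [2, 4], [5, 7]]] * 5)  # (batch, num_spans, 2)
sa = SelfAttentiveSpanExtractor(input_dim=10)
output = sa(seq, indices)
print(output.shape)  # torch.Size([5, 3, 10]) -- output dim equals input_dim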