I want to build a model that predicts the next character based on the previous characters. Here is the architecture of my LSTM model's forward pass:
    embeddings = self.emb(x)  # (batch_size, sequence_length, embedding_dim)
    prediction = []
    h_t = torch.zeros(batch_size, hidden_dim)  # initial hidden state
    c_t = torch.zeros(batch_size, hidden_dim)  # initial cell state
    for i in range(sequence_length):
        x_t = embeddings[:, i, :]
        # standard LSTM gate equations (W_* and b_* are learned parameters)
        i_t = torch.sigmoid(x_t @ W_xi + h_t @ W_hi + b_i)  # input gate
        o_t = torch.sigmoid(x_t @ W_xo + h_t @ W_ho + b_o)  # output gate
        f_t = torch.sigmoid(x_t @ W_xf + h_t @ W_hf + b_f)  # forget gate
        g_t = torch.tanh(x_t @ W_xg + h_t @ W_hg + b_g)     # cell gate
        c_t = f_t * c_t + i_t * g_t                         # new cell state
        h_t = o_t * torch.tanh(c_t)                         # new hidden state
        probs = torch.softmax(h_t @ W + b, dim=-1)
        prediction.append(probs)
    prediction = torch.stack(prediction, dim=1)
    return prediction
After running my forward loop, my predictions have dimension (batch_size, sequence_length, vocabulary_size), which in my case is (128, 100, 44).
Before that, I created a dataset and data loader and defined my targets with dimension (batch_size, sequence_length), which in my case is (128, 100).
but when I calculate my loss:
    criterion = nn.CrossEntropyLoss()
    pred = model(inputs)
    loss = criterion(pred, targets)
I am receiving the following error:
ValueError: Expected target size (128, 44), got torch.Size([128, 100])
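The mismatch can be reproduced with dummy tensors of the shapes described above (shapes taken from the question; depending on the PyTorch version the exception is a ValueError or RuntimeError):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# Dummy tensors with the shapes from the question
pred = torch.randn(128, 100, 44)            # (batch, seq_len, vocab)
targets = torch.randint(0, 44, (128, 100))  # (batch, seq_len)

try:
    loss = criterion(pred, targets)
except (ValueError, RuntimeError) as e:
    # CrossEntropyLoss treats dimension 1 (here 100) as the class
    # dimension, so it expects the target to be (128, 44)
    print(e)
```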
44 in this case is the vocabulary_size. I don't understand how or why I should convert my target variable to dimension (batch_size, vocabulary_size). Can you explain what the target variable for sequence prediction should look like?
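For reference, a sketch of the shape convention that nn.CrossEntropyLoss documents for sequence outputs: input (N, C, L) and target (N, L), where C is the number of classes (here vocabulary_size). The tensor shapes below are taken from the question; note also that CrossEntropyLoss expects raw logits, since it applies log-softmax internally:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.randn(128, 100, 44)          # model output: (batch, seq_len, vocab)
targets = torch.randint(0, 44, (128, 100))  # class indices: (batch, seq_len)

# Move the vocabulary dimension to position 1 before calling the loss:
# (128, 100, 44) -> (128, 44, 100), matching target (128, 100).
loss = criterion(logits.permute(0, 2, 1), targets)

# Equivalent alternative: flatten batch and time into one dimension,
# giving input (12800, 44) and target (12800,).
loss_flat = criterion(logits.reshape(-1, 44), targets.reshape(-1))
```

With the default 'mean' reduction both calls average over every (batch, timestep) position, so they give the same loss value.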
from Recent Questions - Stack Overflow https://ift.tt/3pjSKPw