2020-12-27

Shape of target for LSTM model (PyTorch, from scratch)

I want to build a model that predicts the next character based on the previous characters. Here is the forward pass of my LSTM model:

    # hidden_t and cell_t start as zeros; the weight names below
    # (self.W_ii, self.W_hi, ..., self.W_out) and self.hidden_size are
    # illustrative - the actual parameters are defined in __init__
    embeddings = self.emb(x)  # (batch_size, sequence_length, embedding_dimension)
    batch_size, sequence_length, _ = embeddings.shape
    hidden_t = torch.zeros(batch_size, self.hidden_size, device=x.device)
    cell_t = torch.zeros(batch_size, self.hidden_size, device=x.device)
    prediction = []

    for i in range(sequence_length):

        x_t = embeddings[:, i, :]  # (batch_size, embedding_dimension)

        # standard LSTM gate equations
        input_gate = torch.sigmoid(x_t @ self.W_ii + hidden_t @ self.W_hi + self.b_i)
        output_gate = torch.sigmoid(x_t @ self.W_io + hidden_t @ self.W_ho + self.b_o)
        forget_gate = torch.sigmoid(x_t @ self.W_if + hidden_t @ self.W_hf + self.b_f)
        cell_gate = torch.tanh(x_t @ self.W_ig + hidden_t @ self.W_hg + self.b_g)

        cell_t = forget_gate * cell_t + input_gate * cell_gate
        hidden_t = output_gate * torch.tanh(cell_t)

        # raw logits, not Softmax: nn.CrossEntropyLoss applies log-softmax internally
        logits = hidden_t @ self.W_out + self.b_out  # (batch_size, vocabulary_size)
        prediction.append(logits)
    prediction = torch.stack(prediction, dim=1)  # (batch_size, sequence_length, vocabulary_size)
    return prediction
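
A side note on the final projection: nn.CrossEntropyLoss combines LogSoftmax and NLLLoss internally, which is why the loop above returns raw logits instead of my original Softmax output. A quick sanity check of that equivalence, with random stand-in tensors:

    import torch
    import torch.nn.functional as F

    logits = torch.randn(128, 44)       # (batch_size, vocabulary_size)
    target = torch.randint(44, (128,))  # class indices for one time step

    loss_ce = F.cross_entropy(logits, target)
    loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), target)
    assert torch.isclose(loss_ce, loss_nll)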

After running my forward loop, my predictions have dimension (batch_size, sequence_length, vocabulary_size), which in my case is (128, 100, 44).

Before that, I created a dataset and data loader and defined my targets with dimension (batch_size, sequence_length), which in my case is (128, 100).
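
For context, here is a minimal sketch of how such (input, target) pairs can be built for next-character prediction; the corpus and the char_to_idx mapping are placeholders, not my actual data:

    import torch

    text = "hello world"  # placeholder corpus
    char_to_idx = {c: i for i, c in enumerate(sorted(set(text)))}
    encoded = torch.tensor([char_to_idx[c] for c in text])

    seq_len = 4
    # each target is the input shifted one character to the right
    inputs = torch.stack([encoded[i:i + seq_len] for i in range(len(encoded) - seq_len)])
    targets = torch.stack([encoded[i + 1:i + seq_len + 1] for i in range(len(encoded) - seq_len)])
    # inputs, targets: (num_sequences, seq_len) of class indices - the same layout
    # as my (128, 100) targets after batching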

But when I calculate my loss:

    import torch.nn as nn

    criterion = nn.CrossEntropyLoss()
    pred = model(inputs)  # calling the module directly runs forward()
    loss = criterion(pred, targets)

I am receiving the following error:

    ValueError: Expected target size (128, 44), got torch.Size([128, 100])

Here, 44 is vocabulary_size. I don't understand how or why I should convert my target variable to dimension (batch_size, vocabulary_size). Can you explain what the target variable for sequence prediction should look like?
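
From the nn.CrossEntropyLoss documentation, the input should be (N, C) with a (N,) target of class indices, or (N, C, d1, ...) with a (N, d1, ...) target. If that is right, there seem to be two equivalent ways to line the shapes up; dummy tensors stand in for the real model output and targets:

    import torch
    import torch.nn as nn

    batch_size, seq_len, vocab_size = 128, 100, 44
    criterion = nn.CrossEntropyLoss()

    pred = torch.randn(batch_size, seq_len, vocab_size)         # (N, L, C), as the forward loop returns
    targets = torch.randint(vocab_size, (batch_size, seq_len))  # (N, L) class indices, not one-hot

    # Option 1: move the class dimension to position 1 -> input (N, C, L), target (N, L)
    loss = criterion(pred.permute(0, 2, 1), targets)

    # Option 2: flatten batch and time -> input (N*L, C), target (N*L,)
    loss_flat = criterion(pred.reshape(-1, vocab_size), targets.reshape(-1))

    assert torch.isclose(loss, loss_flat)  # same mean loss over all character positions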


