
Poetry Generation with TensorFlow and LSTM

Generate poetry with TensorFlow and LSTM. Covers tokenization, sequence preparation, Embedding layers, stacked LSTM training, and next-word prediction.

May 22, 2026 at 9:00 PM | 5 min read

Topics You Will Master

Poetry corpus loading, cleaning, and vocabulary construction
N-gram sequence preparation for supervised next-word prediction
Trainable Embedding layer for word vector representations
Stacked LSTM for language modeling
Top-k sampling and temperature scaling for creative text generation

Best For

Developers exploring generative text models with LSTM networks.

Expected Outcome

A stacked LSTM that generates new poetry verses in a learned stylistic pattern.

LSTM networks learn sequential patterns by maintaining a memory state across timesteps — making them capable of generating coherent text sequences. This tutorial trains a stacked LSTM in TensorFlow on a poetry corpus to generate new verses word-by-word through next-word prediction.

Sequence Generation Scheme

Let's Code

PYTHON
import tensorflow as tf
import string
import requests
import pandas as pd
PYTHON
response = requests.get('https://raw.githubusercontent.com/laxmimerit/poetry-data/master/adele.txt')
PYTHON
response.text
OUTPUT
'Looking for some education\nMade my way into the night\nAll that bullshit conversation\nBaby, can\'t you read the signs? I won\'t bore you with the details, baby\nI don\'t even wanna waste your time\nLet\'s just say that maybe\nYou could help me ease my mind\nI ain\'t Mr. Right But if you\'re looking for fast love\nIf that\'s love in your eyes\nIt\'s more than enough\nHad some bad love\nSo fast love is all that I\'ve got on my mind Ooh,
PYTHON
data = response.text.splitlines()
len(data)
OUTPUT
2400
PYTHON
len(" ".join(data))
OUTPUT
91330

Build LSTM Model and Prepare X and y

PYTHON
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Embedding
from tensorflow.keras.preprocessing.sequence import pad_sequences
PYTHON
token = Tokenizer()
token.fit_on_texts(data)
PYTHON
# token.word_counts
PYTHON
help(token)
PYTHON
token.word_index
OUTPUT
{'i': 1, 'you': 2, 'the': 3, 'me': 4, 'to': 5, ...}
PYTHON
encoded_text = token.texts_to_sequences(data)
encoded_text
OUTPUT
[[254, 21, 219, 725], [117, 8, 80, 153, 3, 133], [14, 10, 726, 727], ...]
PYTHON
x = ['i love you']
token.texts_to_sequences(x)
OUTPUT
[[1, 11, 2]]
PYTHON
vocab_size = len(token.word_counts) + 1
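
As a rough illustration of what `fit_on_texts` produces, the sketch below rebuilds a frequency-ranked index in plain Python. The helper `build_word_index` is a hypothetical name, and its lowercase-and-split tokenization is a simplification of the Tokenizer's default filtering. It only aims to show why word indices start at 1 and why `vocab_size` adds 1 for the padding index 0.

```python
from collections import Counter

def build_word_index(lines):
    """Hypothetical stand-in for Tokenizer.fit_on_texts: rank words by frequency."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    # The most frequent word gets index 1; index 0 is left free for padding.
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return {word: rank for rank, (word, _) in enumerate(ranked, start=1)}

toy_lines = ["i love you", "you love me", "you and i"]
word_index = build_word_index(toy_lines)
vocab_size_toy = len(word_index) + 1  # +1 reserves index 0 for padding

print(word_index)
print(vocab_size_toy)
```

Because index 0 never maps to a word, the Embedding and output layers need one more slot than the number of distinct words.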

Prepare Training Data

PYTHON
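datalist = []
for d in encoded_text:
  # Each prefix d[:i] is one training sample: its last token is the word to predict.
  # Lines with fewer than three tokens produce no samples.
  # range(2, len(d)) stops before the full line, so the final word of a line
  # is never a target; range(2, len(d) + 1) would include it.
  for i in range(2, len(d)):
    datalist.append(d[:i])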
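
To see what the loop produces for a single line, the sketch below applies the same slicing to the first encoded line shown earlier. The helper `prefixes` and its `include_full_line` flag are illustrative names, not part of the tutorial's code. The flag shows what changes if the range is extended to `len(d) + 1`, so that the last word of a line also becomes a target.

```python
def prefixes(encoded_line, include_full_line=False):
    """Return growing prefixes of one encoded line, each at least two tokens long."""
    stop = len(encoded_line) + 1 if include_full_line else len(encoded_line)
    return [encoded_line[:i] for i in range(2, stop)]

line = [254, 21, 219, 725]  # first encoded line from the output above

as_in_tutorial = prefixes(line)
with_full_line = prefixes(line, include_full_line=True)

print(as_in_tutorial)   # [[254, 21], [254, 21, 219]]
print(with_full_line)   # [[254, 21], [254, 21, 219], [254, 21, 219, 725]]
```

The first result matches the first two rows of the padded array in the next section. The second variant would change the number of training samples and therefore the training log.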

Padding

PYTHON
max_length = 20
sequences = pad_sequences(datalist, maxlen=max_length, padding='pre')
sequences
OUTPUT
array([[  0,   0,   0, ...,   0, 254,  21],
       [  0,   0,   0, ..., 254,  21, 219],
       [  0,   0,   0, ...,   0, 117,   8],
       ...,
       [  0,   0,   0, ...,  17, 198,  17],
       [  0,   0,   0, ..., 198,  17, 198],
       [  0,   0,   0, ...,  17, 198,   6]], dtype=int32)
PYTHON
X = sequences[:, :-1]
y = sequences[:, -1]
PYTHON
y = to_categorical(y, num_classes=vocab_size)
seq_length = X.shape[1]
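
The padding and the X/y split can be traced by hand on two short sequences. This is a minimal sketch: `pad_pre` is a hypothetical plain-Python analogue of `pad_sequences(..., padding='pre')` with its default front truncation, and the length is shortened to 6 so the rows stay readable.

```python
def pad_pre(seq, maxlen, value=0):
    """Hypothetical plain-Python analogue of pad_sequences(padding='pre')."""
    seq = seq[-maxlen:]  # keep only the last maxlen tokens, as the default truncation does
    return [value] * (maxlen - len(seq)) + seq

max_length_toy = 6  # the tutorial uses 20
datalist_toy = [[254, 21], [254, 21, 219]]

rows = [pad_pre(s, max_length_toy) for s in datalist_toy]
X_toy = [row[:-1] for row in rows]  # every token except the last
y_toy = [row[-1] for row in rows]   # the last token is the word to predict

print(X_toy)  # [[0, 0, 0, 0, 254], [0, 0, 0, 254, 21]]
print(y_toy)  # [21, 219]
```

Left-padding keeps the target word in the final column of every row, which is why a single slice separates inputs from labels.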

LSTM Model Training

PYTHON
model = Sequential()
model.add(Embedding(vocab_size, 50, input_length=seq_length))
model.add(LSTM(100, return_sequences=True))
model.add(LSTM(100))
model.add(Dense(100, activation='relu'))
model.add(Dense(vocab_size, activation='softmax'))
PYTHON
model.summary()
OUTPUT
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding (Embedding)        (None, 19, 50)            69800
_________________________________________________________________
lstm (LSTM)                  (None, 19, 100)           60400
_________________________________________________________________
lstm_1 (LSTM)                (None, 100)               80400
_________________________________________________________________
dense (Dense)                (None, 100)               10100
_________________________________________________________________
dense_1 (Dense)              (None, 1396)              140996
=================================================================
Total params: 361,696
Trainable params: 361,696
Non-trainable params: 0
_________________________________________________________________
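
The Param # column can be checked with a few lines of arithmetic. The sketch below uses the standard parameter-count formulas for Embedding, LSTM, and Dense layers; the function names are illustrative, and the vocabulary size of 1396 is read off the `dense_1` output shape.

```python
vocab_size = 1396  # from the dense_1 output shape in the summary
embed_dim = 50
units = 100

def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights, and a bias vector
    return 4 * (input_dim + units + 1) * units

def dense_params(input_dim, units):
    # one weight per input plus one bias, for every output unit
    return (input_dim + 1) * units

counts = {
    "embedding": vocab_size * embed_dim,
    "lstm": lstm_params(embed_dim, units),
    "lstm_1": lstm_params(units, units),
    "dense": dense_params(units, 100),
    "dense_1": dense_params(100, vocab_size),
}
total = sum(counts.values())

print(counts)
print(total)  # 361696
```

The output layer is the largest single block because its size scales with the vocabulary.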
PYTHON
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
PYTHON
model.fit(X, y, batch_size=32, epochs=50)
OUTPUT
Epoch 49/50
445/445 [==============================] - 3s 6ms/step - loss: 0.5386 - accuracy: 0.8388
Epoch 50/50
445/445 [==============================] - 3s 6ms/step - loss: 0.5385 - accuracy: 0.8371

Poetry Generation

PYTHON
poetry_length = 10  # words per generated line

def generate_poetry(seed_text, n_lines):
  for i in range(n_lines):
    text = []
    for _ in range(poetry_length):
      # Encode the running text and left-pad it to the model's input length
      encoded = token.texts_to_sequences([seed_text])
      encoded = pad_sequences(encoded, maxlen=seq_length, padding='pre')

      # Greedy decoding: pick the single most probable next word
      y_pred = int(np.argmax(model.predict(encoded, verbose=0), axis=-1)[0])

      # index_word is the Tokenizer's reverse mapping (index -> word)
      predicted_word = token.index_word.get(y_pred, "")

      seed_text = seed_text + ' ' + predicted_word
      text.append(predicted_word)

    # Start the next line from the last word of this one
    seed_text = text[-1]
    text = ' '.join(text)
    print(text)
PYTHON
seed_text = 'i love you'
generate_poetry(seed_text, 5)
OUTPUT
is no and i want to do is wash your
name i set fire to the beat tears are gonna
understand last night she let the sky fall when it
was just like a song i was so scared to
make us grow from the arms of your love to

Watch the full NLP course: Introduction to NLP

Conclusion

In this tutorial you trained a stacked two-layer LSTM on an Adele poetry corpus to generate new verses word-by-word. After tokenizing 2,400 lines, building n-gram sequences padded to length 20, and training for 50 epochs with categorical cross-entropy, the model reached 83.7% training accuracy and produced thematically consistent lines — drawing on repeated lyrical patterns like "i set fire to the beat" and "the arms of your love."

Key takeaways:

  • N-gram sequence preparation converts free-form text into a supervised next-word prediction task: each input is a partial sequence and the label is the next word, giving the model thousands of training examples from a small corpus.
  • pad_sequences with padding='pre' left-pads shorter n-grams with zeros so all inputs are the same fixed length, which is required for batch training.
  • Stacking two LSTM layers (first with return_sequences=True) allows the second layer to learn higher-level temporal patterns on top of the first layer's features, improving language modeling quality.
  • softmax on the final dense layer outputs a probability distribution over the full vocabulary. The generation loop in this tutorial always takes the argmax (greedy decoding); sampling from the top-k tokens instead introduces diversity and reduces repetitive output.
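
Top-k sampling is not implemented in the tutorial's code, so the following is only a sketch of the idea under stated assumptions. The function `sample_top_k` and the toy probability vector are hypothetical; in practice the vector would be one row of `model.predict(encoded)`.

```python
import random

def sample_top_k(probs, k, rng=random):
    """Sample an index from the k most probable entries of a probability vector."""
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    weights = [probs[i] for i in top]
    # random.choices treats the weights as relative, so they need not sum to 1
    return rng.choices(top, weights=weights, k=1)[0]

# Toy distribution over a 6-entry vocabulary (index 0 is the padding slot).
probs = [0.0, 0.40, 0.30, 0.15, 0.10, 0.05]
rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_top_k(probs, k=3, rng=rng) for _ in range(1000)]

print(sorted(set(draws)))  # only indices among the top 3 can appear
```

With `k=1` this reduces to the argmax used in `generate_poetry`; larger values of k trade coherence for variety.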

Next steps:

  • Replace the word-level model with a character-level LSTM for finer-grained control over spelling and punctuation in Text Generation using TensorFlow, Keras and LSTM.
  • Use pre-trained GloVe vectors in the Embedding layer instead of learning from scratch to improve generalization on small poetry datasets — see Words Embedding using GloVe Vectors.
  • Apply temperature scaling during inference: dividing logits by a value < 1 makes the model more confident (less creative), while > 1 makes it more diverse (less coherent).
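
Because the model's last layer already applies softmax, temperature has to be applied to the predicted probabilities rather than to raw logits. One way to do that, sketched below with a hypothetical helper `apply_temperature`, is to take the log of each probability, divide by the temperature, and normalize again. This is mathematically equivalent to softmax(logits / temperature).

```python
import math

def apply_temperature(probs, temperature):
    """Rescale a probability vector; equivalent to softmax(logits / temperature)."""
    # log(p) recovers the logits up to an additive constant, which softmax ignores
    scaled = [math.log(p) / temperature if p > 0 else float("-inf") for p in probs]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

probs = [0.6, 0.3, 0.1]
sharper = apply_temperature(probs, 0.5)  # < 1: mass concentrates on the top word
flatter = apply_temperature(probs, 2.0)  # > 1: distribution moves toward uniform

print([round(p, 3) for p in sharper])
print([round(p, 3) for p in flatter])
```

The rescaled vector can then be sampled from instead of taking its argmax, for example with `random.choices`.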
