How to get the index of a certain sentence in Python using NLTK?

I need to find the sentences in a text that contain certain words and output those sentences together with their indexes (by index I mean the sentence's number within the text).

Using the NLTK library, I split my text into sentences and output the ones I need:

Code:

from nltk.tokenize import sent_tokenize, word_tokenize
text = "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum."
search_words = ["Ipsum", "Aldus"]
matches = []
sentances = sent_tokenize(text)
for word in search_words:
    for sentance in sentances:
        if word in sentance:
            matches.append(sentance)
print(matches)

Output

Using len I also got the total number of sentences, but I can't get the matching sentences to output their indexes. When I try to use .index:

index = sentances.index(matches)
print(index)

I'm getting this

Does anybody know how to resolve this?

I've tried to get the indexes of certain sentences.
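
One possible approach, as a sketch rather than a definitive answer: list.index looks for a single element equal to its argument, so passing the whole matches list raises ValueError because no sentence is equal to a list. Recording the index while looping, with enumerate, avoids the second lookup entirely. In the sketch below, find_matching_sentences is a hypothetical helper name (not part of NLTK), and the short sentences list is a stand-in, assumed for illustration, for what sent_tokenize(text) would return.

```python
def find_matching_sentences(sentences, search_words):
    """Return (index, sentence) pairs for sentences containing any search word.

    Hypothetical helper for illustration; not part of NLTK.
    """
    matches = []
    # enumerate yields each sentence's position alongside the sentence itself
    for index, sentence in enumerate(sentences):
        # any() records a sentence once even if several search words occur in it
        if any(word in sentence for word in search_words):
            matches.append((index, sentence))
    return matches


# Stand-in for the list that sent_tokenize(text) would return (assumed data)
sentences = [
    "Lorem Ipsum is simply dummy text.",
    "It has survived five centuries.",
    "Aldus PageMaker included versions of Lorem Ipsum.",
]

result = find_matching_sentences(sentences, ["Ipsum", "Aldus"])
for index, sentence in result:
    print(index, sentence)
```

Indexes here are zero-based, like list positions. If sentence numbers should start at 1, enumerate(sentences, start=1) would do that. Note that `word in sentence` is a substring test, as in the original loop, so "Ipsum" would also match inside a longer word.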
