Stop words with NLTK

Posted by WRW on February 8, 2019

Stop words with NLTK

from nltk.corpus import stopwords from nltk.tokenize import word_tokenize

example_sent = “This is a sample sentence, showing off the stop words filtration.”

stop_words = set(stopwords.words(‘english’))

word_tokens = word_tokenize(example_sent)

filtered_sentence = [w for w in word_tokens if not w.lower() in stop_words]

print(filtered_sentence)

输出为

[‘sample’, ‘sentence’, ‘,’, ‘showing’, ‘stop’, ‘words’, ‘filtration’, ‘.’]