Sentiment analysis using NLTK SentimentIntensityAnalyzer and SentiWordNet Lexicon

Get dependencies:

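import nltk
nltk.download('punkt')                       # tokenizer models for word_tokenize
nltk.download('averaged_perceptron_tagger')  # tagger model for pos_tag
nltk.download('wordnet')                     # synsets that SentiWordNet scores
nltk.download('vader_lexicon')               # lexicon for SentimentIntensityAnalyzer
# Newer NLTK releases may ask for 'punkt_tab' and 'averaged_perceptron_tagger_eng' instead.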

Analyze text:

import nltk
from nltk.corpus import sentiwordnet as swn
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download('sentiwordnet')

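def map_sentiment(word, pos):
    """Label a word 'pos', 'neg', or 'neu' from its average SentiWordNet score.

    senti_synsets expects a WordNet POS code ('n', 'v', 'a', 'r'). Any lookup
    that raises KeyError, as an unmapped tag can, falls through to 'neu'.
    """
    try:
        senti_synsets = list(swn.senti_synsets(word, pos))
        if senti_synsets:
            # Average (positive - negative) over every sense of the word
            sentiment = sum(
                s.pos_score() - s.neg_score() for s in senti_synsets
            ) / len(senti_synsets)
            if sentiment > 0:
                return 'pos'
            elif sentiment < 0:
                return 'neg'
    except KeyError:
        pass
    return 'neu'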

def analyze_sentiment(text):
    # VADER scores: 'pos', 'neg', 'neu' are proportions; 'compound' lies in [-1, 1]
    sia = SentimentIntensityAnalyzer()
    sentiment_scores = sia.polarity_scores(text)
    # Tokenize and tag; pos_tag returns Penn Treebank tags such as 'JJ' or 'NN'
    tokenized_text = nltk.word_tokenize(text)
    tagged_text = nltk.pos_tag(tokenized_text)
    # One SentiWordNet label ('pos', 'neg', or 'neu') per token
    sentiment_scores_swn = [
        map_sentiment(token, pos) for token, pos in tagged_text
    ]
    # Note: each entry adds a VADER proportion to a SentiWordNet token count
    sentiment_scores_combined = {
        'pos': sentiment_scores['pos'] + sentiment_scores_swn.count('pos'),
        'neg': sentiment_scores['neg'] + sentiment_scores_swn.count('neg'),
        'neu': sentiment_scores['neu'] + sentiment_scores_swn.count('neu'),
        'compound': sentiment_scores['compound']
    }
    return sentiment_scores_combined

text = "This is a great day!"
scores = analyze_sentiment(text)
print(scores)

Output:

{'pos': 0.594, 'neg': 0.0, 'neu': 6.406, 'compound': 0.6588}
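
The neu value of 6.406 is what results when map_sentiment labels all six tokens 'neu'. One likely cause is that nltk.pos_tag returns Penn Treebank tags such as 'JJ', while senti_synsets expects the single-letter WordNet codes 'n', 'v', 'a', and 'r'. A minimal sketch of a tag mapping follows; penn_to_wordnet is a hypothetical helper that is not part of the original code, and the tags in the example are written out by hand for illustration so the snippet runs without NLTK data.

```python
# Hypothetical helper (not in the original code): translate a Penn Treebank
# tag, as produced by nltk.pos_tag, into a single-letter WordNet POS code.
def penn_to_wordnet(tag):
    """Return 'a', 'v', 'n', or 'r' for a Penn Treebank tag, else None."""
    if tag.startswith('J'):
        return 'a'   # adjectives: JJ, JJR, JJS
    if tag.startswith('V'):
        return 'v'   # verbs: VB, VBD, VBG, VBN, VBP, VBZ
    if tag.startswith('N'):
        return 'n'   # nouns: NN, NNS, NNP, NNPS
    if tag.startswith('RB'):
        return 'r'   # adverbs: RB, RBR, RBS
    return None      # determiners, punctuation, etc. have no WordNet POS


# Illustrative (token, tag) pairs for the example sentence, typed by hand.
tagged = [('This', 'DT'), ('is', 'VBZ'), ('a', 'DT'),
          ('great', 'JJ'), ('day', 'NN'), ('!', '.')]
mapped = [(token, penn_to_wordnet(tag)) for token, tag in tagged]
print(mapped)
```

Tokens that map to None could be skipped or counted as neutral, and the remaining tokens passed to map_sentiment with the mapped code in place of the raw tag. The resulting scores would differ from the output shown above.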

Conclusion:

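The code above scores the text "This is a great day!" with both SentimentIntensityAnalyzer and SentiWordNet. The combined scores are:

  • Positive sentiment score (pos): 0.594

  • Negative sentiment score (neg): 0.0

  • Neutral sentiment score (neu): 6.406

  • Compound sentiment score (compound): 0.6588

The pos, neg, and neu values each add a SentimentIntensityAnalyzer (VADER) proportion between 0 and 1 to a count of the tokens that map_sentiment labeled with that class, so they are not on a common scale. Here the SentiWordNet counts for pos and neg are both 0, which leaves the VADER proportions 0.594 and 0.0 unchanged, while neu is the VADER proportion 0.406 plus 6 tokens labeled neutral (five words and the "!"). That every token, including "great", lands in the neutral class is consistent with map_sentiment receiving Penn Treebank tags such as 'JJ' from nltk.pos_tag instead of the WordNet POS codes ('n', 'v', 'a', 'r') that senti_synsets expects. The compound score of 0.6588 comes from SentimentIntensityAnalyzer alone and indicates an overall positive sentiment.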
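
Because analyze_sentiment adds VADER proportions to SentiWordNet token counts, the combined pos, neg, and neu values grow with the length of the text. One possible way to keep them comparable is to convert the counts to proportions before combining. The sketch below assumes a simple unweighted average, which is a design choice and not something the original code does; combine_scores is a hypothetical name, and the VADER neu value of 0.406 is inferred from the output above (6.406 minus 6 tokens).

```python
def combine_scores(vader_scores, swn_labels):
    """Average VADER proportions with SentiWordNet label proportions.

    vader_scores: dict with 'pos', 'neg', 'neu', 'compound' (as from polarity_scores)
    swn_labels:   list of 'pos' / 'neg' / 'neu' labels, one per token
    """
    total = len(swn_labels)
    if total == 0:
        # No tokens to count: fall back to the VADER scores unchanged
        return dict(vader_scores)
    combined = {}
    for label in ('pos', 'neg', 'neu'):
        swn_share = swn_labels.count(label) / total  # count -> proportion
        combined[label] = round((vader_scores[label] + swn_share) / 2, 4)
    # The compound score only exists on the VADER side, so pass it through
    combined['compound'] = vader_scores['compound']
    return combined


# Values taken or inferred from the example output: six tokens, all 'neu'
vader = {'pos': 0.594, 'neg': 0.0, 'neu': 0.406, 'compound': 0.6588}
labels = ['neu'] * 6
print(combine_scores(vader, labels))
```

With this scheme the three class values stay between 0 and 1 and sum to roughly 1, regardless of how many tokens the text contains. A weighted average would work the same way if one source should count more than the other.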

Please note that sentiment analysis is not always accurate, and results can vary depending on the context and the specific algorithms or lexicons used.