DataScience With Python/R/SAS: Sentimental Analysis using Natural Language Processing | Machine Learning

Let us explore the NLTK module, starting with getting and preparing the data.
In this example we are going to train two models to classify SMS messages as "Spam" or "Ham".

Let's import the relevant modules and load the tab-separated file into a pandas DataFrame. The dataset has 5572 observations and 2 features. We will add 1 more feature derived from the classification type (label).
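A minimal sketch of this loading step. The three rows below are made-up placeholders standing in for the real file, and the column names `label`, `message` and `label_num` are assumptions chosen for illustration. The mapping of ham to 0 and spam to 1 is one possible way to derive the extra feature.

```python
import io

import pandas as pd

# Made-up stand-in for the tab-separated SMS file (label<TAB>message).
# With the real dataset you would pass the file path to read_csv instead.
sample_tsv = (
    "ham\tsee you at home tonight\n"
    "spam\twin a free prize now\n"
    "ham\tcall me when you leave work\n"
)

# The data has no header row, so the column names are supplied explicitly.
df = pd.read_csv(
    io.StringIO(sample_tsv),
    sep="\t",
    header=None,
    names=["label", "message"],
)

# Add one more feature derived from the label: ham -> 0, spam -> 1.
df["label_num"] = df["label"].map({"ham": 0, "spam": 1})

print(df.shape)
print(df.head())
```

On the full dataset, `df.shape` is a quick way to confirm the number of observations and features after loading.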

Assign the features to X and the target to y. Now we split the data into training and testing sets, and then vectorize the messages to get the features.
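The split and vectorizing steps could look roughly like the sketch below. The eight messages are made-up placeholders, and `test_size=0.25`, `random_state=1` and `CountVectorizer` are assumptions rather than the exact settings used here. In current scikit-learn releases `train_test_split` lives in `sklearn.model_selection`; older releases exposed it from `sklearn.cross_validation`.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

# Made-up messages (X) and labels (y): 0 = ham, 1 = spam.
X = [
    "are we meeting for lunch today",
    "see you at home tonight",
    "call me when you leave work",
    "running late be there soon",
    "win a free prize now",
    "free cash offer claim now",
    "urgent claim your free prize",
    "you have won a free holiday",
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Hold out a quarter of the messages for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)

# Learn the vocabulary from the training messages only, then reuse it
# to turn the test messages into word-count vectors.
vectorizer = CountVectorizer()
X_train_counts = vectorizer.fit_transform(X_train)
X_test_counts = vectorizer.transform(X_test)

print(X_train_counts.shape, X_test_counts.shape)
```

Fitting the vectorizer on the training set alone keeps words that appear only in the test messages out of the vocabulary, so the test set stays unseen.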

We will use the multinomial Naive Bayes classifier, as it is suitable for classification with **discrete features** (e.g., word counts for text classification). The multinomial distribution normally requires integer feature counts. However, in practice, fractional counts such as tf-idf may also work.
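As a rough illustration, the classifier can be fitted on integer word counts as sketched below. The training messages are made-up placeholders and the default `MultinomialNB()` settings are an assumption.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up training data for illustration only.
train_messages = [
    "are we meeting for lunch today",
    "see you at home tonight",
    "call me when you leave work",
    "win a free prize now",
    "free cash offer claim now",
    "urgent claim your free prize",
]
train_labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

# CountVectorizer produces integer word counts, the discrete features
# that the multinomial model expects.
vectorizer = CountVectorizer()
X_counts = vectorizer.fit_transform(train_messages)

model = MultinomialNB()
model.fit(X_counts, train_labels)

print(model.classes_)
print(model.score(X_counts, train_labels))
```

Swapping `CountVectorizer` for `TfidfVectorizer` would give the fractional tf-idf features mentioned above, which the same classifier also accepts.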

Let's now predict the class for the following SMS:
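A small sketch of the prediction step. Both the training messages and the two new messages are made up for illustration, and chaining the vectorizer and classifier with `make_pipeline` is a convenience chosen here, not necessarily how the models in this example were built.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up training data for illustration only.
train_messages = [
    "are we meeting for lunch today",
    "see you at home tonight",
    "call me when you leave work",
    "win a free prize now",
    "free cash offer claim now",
    "urgent claim your free prize",
]
train_labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

# The pipeline vectorizes raw text and then classifies it, so new
# messages can be passed in as plain strings.
classifier = make_pipeline(CountVectorizer(), MultinomialNB())
classifier.fit(train_messages, train_labels)

# Hypothetical new messages to classify.
new_sms = ["claim your free prize now", "see you at lunch today"]
predictions = classifier.predict(new_sms)

for text, label in zip(new_sms, predictions):
    print(label, "<-", text)
```

`classifier.predict_proba(new_sms)` returns the per-class probabilities if you want to see how confident the model is about each message.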
