I built an example of web page classification in Python with a Naive Bayes library, and it works perfectly (it classifies web pages very well).
I have two questions. The first:
I currently use only the content of the web page (the article part). That works, but I would like the title to contribute to the output with double weight. I can retrieve the page titles; they are stored in a list named titles[]. This is my classification code:
x_train = vectorizer.fit_transform(temizdata)
classifer.fit(x_train, y_train)
I could append the title to the article text, but then the title and the article text would have the same weight.
In this code, temizdata is the list that holds the article text of the web pages, and y_train holds the classes. How can I integrate titles[] into the classification with double weight?
I use CountVectorizer for vectorization and the Naive Bayes MultinomialNB classifier:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
vectorizer = CountVectorizer()
classifer = MultinomialNB(alpha=.01)
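
To make clear what I mean by "double weight", here is a rough sketch of the behaviour I am after. It is not something I have validated on my real data: the toy values for titles, temizdata and y_train are made up for illustration, TITLE_WEIGHT is a name I invented, and I am assuming that adding the title counts twice to the article counts is a reasonable way to express the weighting.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy stand-ins for the real lists (hypothetical values, illustration only)
titles = ["football match result", "new phone released"]
temizdata = [
    "the team won the game last night",
    "the company announced a new device today",
]
y_train = ["sport", "tech"]

TITLE_WEIGHT = 2  # hypothetical name: how many times each title word is counted

vectorizer = CountVectorizer()
# Learn one shared vocabulary so titles and article text map to the same columns
vectorizer.fit(temizdata + titles)

x_body = vectorizer.transform(temizdata)   # word counts from the article text
x_title = vectorizer.transform(titles)     # word counts from the titles
# Sparse matrices support scalar multiplication and addition,
# so each title word ends up counted TITLE_WEIGHT times
x_train = x_body + TITLE_WEIGHT * x_title

classifer = MultinomialNB(alpha=.01)
classifer.fit(x_train, y_train)
```

With plain word counts this should give the same matrix as concatenating each title twice onto its article text before vectorizing, but it keeps the weight as a single number that is easy to change.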