
I am learning how to analyze a large volume of comments, and I asked ChatGPT to generate the code for me.

However, Python can't find some of the modules even though I installed them in my venv. I installed them with `pip install pandas nltk scikit-learn gensim`, and I'm fairly sure they are properly installed because they all show up in `pip list`.
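
If it helps, here is a quick check I could run from inside the activated venv to see whether that interpreter can locate pandas at all (just a throwaway check, not part of my script):

import importlib.util

# prints a ModuleSpec if this interpreter can see pandas, None if it cannot
print(importlib.util.find_spec("pandas"))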

I am using Python 3.11.4.

Here is the error message:

(data_analysis) D:\> py "D:\New folder\Main Program"
Traceback (most recent call last):
  File "D:\New folder\Main Program", line 1, in <module>
    import pandas as pd
ModuleNotFoundError: No module named 'pandas'
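
To narrow it down, I could also temporarily add a couple of lines at the very top of the script to see which interpreter and search path are actually used when I launch it with py (these lines are not part of the real script):

import sys

# which interpreter is actually executing the file
print("interpreter:", sys.executable)

# the directories this interpreter searches when importing modules
for p in sys.path:
    print(p)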

And here is my code:

import pandas as pd
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.sentiment import SentimentIntensityAnalyzer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from gensim.summarization import summarize

nltk.download('punkt')
nltk.download('stopwords')
nltk.download('vader_lexicon')

# Load and preprocess data
def preprocess_text(text):
    # Tokenize and remove stopwords
    words = word_tokenize(text)
    words = [word.lower() for word in words if word.isalpha()]
    words = [word for word in words if word not in stopwords.words('english')]
    return ' '.join(words)

# Load your comments data into a DataFrame (assuming 'comments' column)
data = pd.read_csv('comments.csv')
data['cleaned_comment'] = data['comment'].apply(preprocess_text)

# Sentiment Analysis
sia = SentimentIntensityAnalyzer()
data['sentiment_score'] = data['cleaned_comment'].apply(lambda x: sia.polarity_scores(x)['compound'])

# Topic Modeling using LDA
vectorizer = CountVectorizer(max_df=0.8, min_df=2, stop_words='english')
doc_term_matrix = vectorizer.fit_transform(data['cleaned_comment'])
lda_model = LatentDirichletAllocation(n_components=5, random_state=42)
lda_model.fit(doc_term_matrix)
data['topic'] = lda_model.transform(doc_term_matrix).argmax(axis=1)

# Extractive Summarization
data['summary'] = data['comment'].apply(lambda x: summarize(x, ratio=0.3))

# Manual Review and Visualization
for index, row in data.iterrows():
    print(f"Comment {index+1} - Sentiment: {row['sentiment_score']:.2f}, Topic: {row['topic']}")
    print("Original Comment:", row['comment'])
    print("Summary:", row['summary'])
    print("="*50)

# Save summarized data to a new CSV file
data.to_csv('summarized_comments.csv', index=False)

  • Does this answer your question? [ImportError: No module named pandas](https://stackoverflow.com/questions/33481974/importerror-no-module-named-pandas) – Ada Aug 24 '23 at 09:49
  • Have you installed it in `data_analysis` environment? – shaik moeed Aug 24 '23 at 09:57
  • It sounds like you either have not properly installed pandas or you're not properly connected to the right virtual environment. Start by looking up the settings on your code editor and ensure you're hooked up properly. – hkh Aug 24 '23 at 09:57

0 Answers