Iterating a Python script over each row in pandas

Question

I have a Python script that sends Twitter alerts through Slack:

def twitter_setup():
    """
    Utility function to setup the Twitter's API
    with our access keys provided.
    """
    # Authentication and access using keys:
    auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
    auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)

    # Return API with authentication:
    api = tweepy.API(auth)
    return api


extractor = twitter_setup()
# We create a tweet list as follows:
tweets = extractor.user_timeline(screen_name="**FortniteGame**", count=200)


data = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['Tweets'])

# We add relevant data:
data['ID'] = np.array([tweet.id for tweet in tweets])
data['Date'] = np.array([tweet.created_at for tweet in tweets])
data['text'] = np.array([tweet.text for tweet in tweets])
#data['Date'] = pd.to_datetime(data['Date'], unit='ms').dt.tz_localize('UTC').dt.tz_convert('US/Eastern')

created_time = datetime.datetime.utcnow() - datetime.timedelta(minutes=1)

data = data[(data['Date'] > created_time) & (
    data['Date'] < datetime.datetime.utcnow())]

my_list = ['Maintenance', 'Scheduled', 'downtime', 'Issue', 'Voice', 'Happy',
           'Problem', 'Outage', 'Service', 'Interruption', 'voice-comms', 'Downtime']

ndata = data[data['Tweets'].str.contains(
    "|".join(my_list), regex=True)].reset_index(drop=True)


slack = Slacker('xoxb-3434-4334-fgsgsdfsf')

#message = "test message"
slack.chat.post_message('#ops-twitter-alerts', 'FNWP :' +' '+ ndata['Tweets'] + '<!channel|>')

I also have a CSV file, which I read into pandas as shown below:

       client domain twittername
1.)    EPIC   FNWP   FortniteGame
2.)    PUBG   BLHP   PUBG
3.)    abc    xyx    abhi98358

I want to run the same script for each client by iterating over the rows: first for FortniteGame, then for PUBG, then for abhi98358, one row at a time.
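
One way to make the two values dynamic is to loop over the rows of the client table and pull out `domain` and `twittername` for each row. This is a minimal sketch, assuming the CSV has the three columns shown above. The inline CSV text and the `accounts` list are illustrative only; in the real script the loop body would run the Twitter and Slack steps instead of appending to a list.

```python
import io

import pandas as pd

# Inline stand-in for the real CSV file (column names assumed from the table above).
csv_text = """client,domain,twittername
EPIC,FNWP,FortniteGame
PUBG,BLHP,PUBG
abc,xyx,abhi98358
"""

clients = pd.read_csv(io.StringIO(csv_text))

accounts = []
for _, row in clients.iterrows():
    # Each row is a Series, so columns are read by name.
    accounts.append((row["domain"], row["twittername"]))

print(accounts)
```

Because the loop body finishes for one row before the next row is read, anything placed inside it runs for one client at a time.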

use iterrows or for loop would be sufficient to go through the dataframe — iamklaus, Oct 18 '18 at 12:47
hey @SarthakNegi i am new in python... I am good with scripting but kind of not good in loops and all. here two variables i want to make dynamic one is twitter name and one , 'FNWP :' + client name. if you can provide some sample code. — , Oct 18 '18 at 12:50
https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas — iamklaus, Oct 18 '18 at 12:51
Possible duplicate of [How to iterate over rows in a DataFrame in Pandas?](https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas) — Georgy, Oct 18 '18 at 12:52
@Georgy I saw this example but i am kind of getting stuck how to pass the client name and twitter name to two different variables. I am messing up at some point. — , Oct 18 '18 at 12:54
@ak333: I can see you are only using column`twitername` on this line of code `tweets = extractor.user_timeline(screen_name="**FortniteGame**", count=200)`. So you just want to use iterate on this instead of hardcoding `FortniteGame`?. Correct me if I am wrong? — Rahul Agarwal, Oct 18 '18 at 13:06
@RahulAgarwal that is exactly what i want.... and on the last line i just want to put domain name instead of putting FNWP :- , 'FNWP :' +' '+ ndata['Tweets'] . these two things i wwant to iterate over. — , Oct 18 '18 at 13:08

Rahul Agarwal · Answer 1 · 2018-10-18T15:30:14.957

0

Sample df:

t = pd.DataFrame({'A': ['FortniteGame', 'PUBG', 'abhi98358']})

Sample Iteration:

for index, row in t.iterrows():
    print("**" + row['A'] + "**")

Sample Output for above:

**FortniteGame**
**PUBG**
**abhi98358**

For your code:

for index, row in df.iterrows():
    tweets = extractor.user_timeline(screen_name=("**" + row['twittername'] + "**"), count=200)

edited Oct 18 '18 at 15:30

answered Oct 18 '18 at 13:17


i am gonna try now – Oct 18 '18 at 13:53
sure thanks a lot i am trying if i wont b able to i will post again – Oct 18 '18 at 14:03
If it solves what you are looking, do accept and upvote for future users!! – Rahul Agarwal Oct 18 '18 at 14:05
So it's working with your example, but when I try it on mine it says AttributeError: 'DataFrame' object has no attribute 'iterows'. Also, I don't want to get tweets for all the users at the same time; I want it to complete the script for one user and then go on to the second row, and so on. – Oct 18 '18 at 14:08
still the same issue ...below i have written for index,rows in t.iterows: tweets = extractor.user_timeline(screen_name=(row.twittername ), count=200) data = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['Tweets']) # We add relevant data: data['ID'] = np.array([tweet.id for tweet in tweets]) data['Date'] = np.array([tweet.created_at for tweet in tweets]) data['text'] = np.array([tweet.text for tweet in tweets]) – Oct 18 '18 at 14:19
t is the name of my dataframe here – Oct 18 '18 at 14:27
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/182098/discussion-between-rahul-agarwal-and-ak333). – Rahul Agarwal Oct 18 '18 at 15:08

score 0 · Answer 2 · answered Oct 18 '18 at 13:20

You can do it as shown below:

twtLst = []
for _, row in df.iterrows():
   twtDtl = row['client'], row['domain'], row['twittername']
   twtLst.append(twtDtl)

twtLst will be a list of tuples, which you can then unpack as shown below:

for twt in twtLst:
    client, domain, twtname = twt
    tweets = extractor.user_timeline(screen_name="**" + twtname +"**", count=200)
    #message = "test message"
    slack.chat.post_message('#ops-twitter-alerts', domain + ':' + client +' '+ndata['Tweets'] + '<!channel|>')
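
To make each account finish before the next one starts, the fetch, filter, and post steps can all live in one function that is called once per row. The following is only a sketch under stated assumptions: `process_account`, `fake_timelines`, and `sent` are hypothetical names, and the two callables stand in for `extractor.user_timeline(...)` and `slack.chat.post_message(...)` so the example runs without network access.

```python
def process_account(domain, screen_name, fetch_tweets, post_message, keywords):
    """Run the whole fetch -> filter -> post pipeline for one account."""
    texts = fetch_tweets(screen_name)
    # Keep only tweets that mention at least one keyword (case-sensitive).
    matches = [t for t in texts if any(k in t for k in keywords)]
    for text in matches:
        post_message(domain + ": " + text)
    return len(matches)


# Hypothetical stand-ins for the Twitter timeline and the Slack channel.
fake_timelines = {
    "FortniteGame": ["Scheduled downtime at 4 AM ET", "New skins today"],
    "PUBG": ["Patch notes are live"],
}
sent = []

rows = [("EPIC", "FNWP", "FortniteGame"), ("PUBG", "BLHP", "PUBG")]
for client, domain, twtname in rows:
    # One complete run per row: nothing from the next account is touched
    # until this call returns.
    process_account(
        domain,
        twtname,
        fetch_tweets=lambda name: fake_timelines.get(name, []),
        post_message=sent.append,
        keywords=["Scheduled", "Outage", "Downtime"],
    )

print(sent)
```

Passing the fetch and post steps in as arguments is a design choice for this sketch; it keeps the per-account logic testable without real credentials.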

ansu5555


In the second chunk I just need the screen name and nothing else, so I should do something like tweets = extractor.user_timeline(screen_name="**" , count=200), right? – Oct 18 '18 at 13:26
Yes correct, `"**" + twtname +"**"` is just concatenation of 3 strings – ansu5555 Oct 18 '18 at 13:34
It's giving me the list of tweets for both clients at the same time. – Oct 18 '18 at 13:57
I want it to run the same process for each of them one by one, not together. – Oct 18 '18 at 14:04

score 0 · Accepted Answer · answered Oct 18 '18 at 18:37

Here you go

Refer to my solution in "converting a python script into a function to iterate over each row":

for index, row in dff.iterrows():
    twt=row['twittername']
    domain = row['domain']
    print(twt)
    print(domain)
    extractor = twitter_setup()
    # We create a tweet list as follows:
    tweets = extractor.user_timeline(screen_name=twt, count=200)
    data = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['Tweets'])

    # We add relevant data:
    data['ID'] = np.array([tweet.id for tweet in tweets])
    data['Date'] = np.array([tweet.created_at for tweet in tweets])
    data['text'] = np.array([tweet.text for tweet in tweets])
    #data['Date'] = pd.to_datetime(data['Date'], unit='ms').dt.tz_localize('UTC').dt.tz_convert('US/Eastern')

    created_time = datetime.datetime.utcnow() - datetime.timedelta(minutes=160)

    data = data[(data['Date'] > created_time) & (data['Date'] < datetime.datetime.utcnow())]

    my_list = ['Maintenance', 'Scheduled', 'downtime', 'Issue', 'Voice', 'Happy','hound',
               'Problem', 'Outage', 'Service', 'Interruption', 'ready','voice-comms', 'Downtime','Patch']

    ndata = data[data['Tweets'].str.contains( "|".join(my_list), regex=True)].reset_index(drop=True)

    print(ndata)
    if len(ndata['Tweets'])> 0:
        slack.chat.post_message('#ops-twitter-alerts', domain  +': '+ ndata['Tweets'] + '<!channel|>')
    else:
        print('hi')
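
The time window and keyword filter in the loop above can also be written without a DataFrame, producing one message string per matching tweet. This is a rough sketch, not the answer's method: `recent_matches`, `KEYWORDS`, and the sample data are hypothetical, and it assumes `created_at` values are naive UTC datetimes, as the comparison with `utcnow()` in the original implies. Case-insensitive matching is a swapped-in choice, so that 'downtime' and 'Downtime' do not both need listing.

```python
import datetime
import re

KEYWORDS = ["Maintenance", "Scheduled", "Downtime", "Outage", "voice-comms"]
# re.escape keeps characters such as "-" literal; IGNORECASE lets one entry
# cover both "downtime" and "Downtime".
PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def recent_matches(tweets, now, window_minutes):
    """Return texts of tweets inside the time window that mention a keyword.

    `tweets` is a list of (created_at, text) pairs with naive UTC datetimes.
    """
    cutoff = now - datetime.timedelta(minutes=window_minutes)
    return [
        text
        for created_at, text in tweets
        if cutoff < created_at < now and PATTERN.search(text)
    ]


# Fixed timestamp and sample tweets so the sketch is reproducible.
now = datetime.datetime(2018, 10, 18, 18, 0, 0)
sample = [
    (datetime.datetime(2018, 10, 18, 17, 30), "Scheduled downtime begins soon"),
    (datetime.datetime(2018, 10, 18, 17, 45), "Check out the new trailer"),
    (datetime.datetime(2018, 10, 18, 12, 0), "Outage resolved"),
]

domain = "FNWP"
messages = [domain + ": " + text + "<!channel|>"
            for text in recent_matches(sample, now, 160)]
print(messages)
```

Each entry in `messages` is a plain string, so it could be passed to `slack.chat.post_message` one at a time inside the per-row loop.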
