0

im using sqlite3 to store data from a pandas dataframe

i Scrape data from Twitter and i want it to be every hour

and to do that i want to filter the date in the database the date format is like this :

 2021-11-11 08:07:33+00:00

the query im using :

cur.execute("SELECT * FROM tweets_b_db  WHERE tweet_created_at > "+li+" ")

li is a variable last inserted date before filling the db again

cur.execute("SELECT tweet_created_at FROM tweets_b_db ORDER BY tweet_created_at DESC LIMIT 1")
    li = cur.fetchone()

What it returns :

can only concatenate str (not "tuple") to str

my code :

import tweepy
import time
import datetime
import pandas as pd
import sqlite3

con = sqlite3.connect('tweetScaping.db')
cur = con.cursor()
consumer_key = "**********************"
consumer_secret = "****"
access_token = "****-*****"
access_token_secret = "***************"
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)

# using tweepy to search for the keyword Bouygues telecom
text_query = 'bouygues telecom OR @bouyguestelecom OR #Bouygues'
count = 200




try:
    # Creation of query method using parameters
    tweets = tweepy.Cursor(api.search_tweets, q=text_query + " -filter:retweets").items(count)

    # Pulling information from tweets iterable object
    tweets_list = [[tweet.created_at, tweet.id, tweet.text] for tweet in tweets]

    # Creation of dataframe from tweets list
    # Add or remove columns as you remove tweet information
    # tweets_df = pd.DataFrame(columns=['tweet_created_at', 'tweet_id', 'tweet_text'])
    tweets_df = pd.DataFrame(tweets_list)
    tweets_df.columns = ['tweet_created_at', 'tweet_id', 'tweet_text']


    #last inserted
    cur.execute("SELECT tweet_created_at FROM tweets_b_db ORDER BY tweet_created_at DESC LIMIT 1")
    li = cur.fetchone()


    # to insert results to database (sqlite3)
    tweets_df.to_sql(name='tweets_b_db', con=con, if_exists='replace')


    # to show table content
    cur.execute("SELECT * FROM tweets_b_db  WHERE tweet_created_at > "+li+" ")
    print(cur.fetchall())

    

except BaseException as e:
    print('failed on_status,', str(e))
    time.sleep(3) 

Update : using :

cur.execute("SELECT tweet_created_at FROM tweets_b_db ORDER BY tweet_created_at DESC LIMIT 1")
data = cur.fetchone()
data = data[0]
cur.execute("SELECT * FROM tweets_b_db  WHERE tweet_created_at >= Datetime('{data}')")
print(cur.fetchall())

returns nothing :

[]

would be very helpful if someone can guide me to the right direction

ossama assaghir
  • 298
  • 2
  • 16

2 Answers2

1

fetchone() returns a tuple that contains all of your requested columns in the query. in your case, there's a single column (tweet_created_at), so a single element (value of tweet_created_at) will be present in your tuple, which can be accessed at index 0.

li = cur.fetchone()
li = li[0]
Ammar Aslam
  • 560
  • 2
  • 16
  • oh thank you the variable is returning now "2021-11-11 10:00:45+00:00" but i get this error : ``` near "10": syntax error``` should parse or do something to it ?? – ossama assaghir Nov 11 '21 at 10:57
  • 1
    cur.execute(f"SELECT * FROM tweets_b_db WHERE tweet_created_at > '{li}'") Watch out for sql injection though, not a good approach https://stackoverflow.com/questions/13613037/is-this-python-code-vulnerable-to-sql-injection-sqlite3 – Ammar Aslam Nov 11 '21 at 11:07
  • thanks to you ammar there is no errors anymore but it returns " [] " i added >= so if no tweets added to get the last date but it returns nothing – ossama assaghir Nov 11 '21 at 11:26
  • I'm seeing that you are trying to compare dates, a string comparison will not work probably, these answers would be helpful https://stackoverflow.com/questions/1975737/sqlite-datetime-comparison – Ammar Aslam Nov 11 '21 at 11:34
  • very helpful answers , apparently sqlite does not have a storage class set aside for storing dates and/or times , i tried the datetime and strftime functions but no results i can't change the date format to only YYYY/MM/DD because i want to filter by hours – ossama assaghir Nov 11 '21 at 12:00
  • Did you try select * from table_1 where mydate > Datetime('{li}') – Ammar Aslam Nov 11 '21 at 12:45
  • Yes tried it but returns nothing "[]" – ossama assaghir Nov 11 '21 at 13:20
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/239120/discussion-between-ammar-aslam-and-ossama-assaghir). – Ammar Aslam Nov 11 '21 at 13:30
0

If you use an f-string for the sql statement:

cur.execute(f"SELECT * FROM tweets_b_db  WHERE tweet_created_at >= Datetime('{data}')")

I'm sure your code will work.

But, the recommended way to pass parameters is with ? placeholders:

cur.execute("SELECT * FROM tweets_b_db  WHERE tweet_created_at >= Datetime(?)", (data,))

Also, if data has the correct datetime format yyyy-mm-dd hh:MM:ss then you don't need the function DATETIME():

cur.execute("SELECT * FROM tweets_b_db  WHERE tweet_created_at >= ?", (data,))
forpas
  • 160,666
  • 10
  • 38
  • 76
  • thanks for your answer , it strangly worked with SELECT * FROM tweets_b_db WHERE Datetime(tweet_created_at) < '{data}' giving only the column the datetime function – ossama assaghir Nov 11 '21 at 13:48
  • @ossamaassaghir The string `"SELECT * FROM tweets_b_db WHERE Datetime(tweet_created_at) < '{data}'"` without an `f` in front does not work. – forpas Nov 11 '21 at 13:51
  • it did work for me ! – ossama assaghir Nov 11 '21 at 14:16
  • @ossamaassaghir no it does not work. You may get results but for the wrong reason. Check this: https://dbfiddle.uk/?rdbms=sqlite_3.27&fiddle=edc70b1f4293c781097b350ef16dc45a – forpas Nov 11 '21 at 14:27