My problem is that Django inserts entries waaaaaaay too slow ( i didnt even time but it was more than 5 mins) for 100k entries from Pandas csv file. What i am doing is parsing csv file and then save those objects to postgresql in Django. It is going to be a daily cronjob with csv files differ for most of the entries(some can be duplicates from the previous days or owner could be the same)
I haven't tried raw queries, but i dont think that would help much. and i am really stuck at this point honestly. apart from some iteration manipulations and making a generator, rather than iterator i can not somehow improve the time of insertions.
class TrendType(models.Model):
""" Описывает тип отчета (посты, видео, субъекты)"""
TREND_TYPE = Choices('video', 'posts', 'owners') ## << mnemonic
title = models.CharField(max_length=50)
mnemonic = models.CharField(choices=TREND_TYPE, max_length=30)
class TrendSource(models.Model):
""" Источник отчета (файла) """
trend_type = models.ForeignKey(TrendType, on_delete=models.CASCADE)
load_date = models.DateTimeField()
filename = models.CharField(max_length=100)
class TrendOwner(models.Model):
""" Владелец данных (группа, юзер, и т.п.)"""
TREND_OWNERS = Choices('group', 'user', '_')
title = models.CharField(max_length=50)
mnemonic = models.CharField(choices=TREND_OWNERS, max_length=30)
class Owner(models.Model):
""" Данные о владельце """
link = models.CharField(max_length=255)
name = models.CharField(max_length=255)
trend_type = models.ForeignKey(TrendType, on_delete=models.CASCADE)
trend_owner = models.ForeignKey(TrendOwner, on_delete=models.CASCADE)
class TrendData(models.Model):
""" Модель упаковка всех данных """
owner = models.ForeignKey(Owner, on_delete=models.CASCADE)
views = models.IntegerField()
views_u = models.IntegerField()
likes = models.IntegerField()
shares = models.IntegerField()
interaction_rate = models.FloatField()
mean_age = models.IntegerField()
source = models.ForeignKey(TrendSource, on_delete=models.CASCADE)
date_trend = models.DateTimeField() # << take it as a date
Basically, i would love a good solution for a 'fast' insertion to a database and is it even possible given these models.