I’m working on a project analyzing large code bases. Just to mess around, I’m using SQLite. A strange thing happened when updating ~10,000 rows: it was really slow! SQLite was updating about ten records a second.
It turns out SQLite rightly cares about your data: it wraps each UPDATE statement in its own transaction and verifies that every bit of your data hits disk before returning. In my case, if disaster strikes and I lose my tiny database, I can recreate it in a few seconds. So I decided to live dangerously and tell SQLite to go ahead and try to update the data, but not wait around for it to hit disk.
Here’s the magic. Put this before your bulk INSERT/UPDATE statements. Note that it affects the entire connection for the rest of the session, so don’t run it before commands that manipulate data you actually care about.
from django.db import connection

if connection.vendor == 'sqlite':
    cursor = connection.cursor()
    # Don't wait for writes to reach disk before returning
    cursor.execute('PRAGMA synchronous = OFF')
Three commands sped up my program a zillion percent: the entire 10K rows updated in a fraction of a second instead of minutes. Yay!
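For context, here’s a minimal standalone sketch of the same effect using plain `sqlite3` instead of Django. The `synchronous = OFF` pragma is the “don’t wait for disk” setting described above; `journal_mode = MEMORY` and `temp_store = MEMORY` are common companions for bulk loads, and pairing them here is my assumption, not a transcript of the original three commands.

```python
import os
import sqlite3
import tempfile
import time

# isolation_level=None puts the connection in autocommit mode, so each
# UPDATE runs as its own transaction -- the situation that was slow.
path = os.path.join(tempfile.mkdtemp(), 'scratch.db')
conn = sqlite3.connect(path, isolation_level=None)
conn.execute('CREATE TABLE files (id INTEGER PRIMARY KEY, loc INTEGER)')
conn.executemany('INSERT INTO files (loc) VALUES (?)', [(0,)] * 10000)

# Trade durability for speed (fine for throwaway data):
conn.execute('PRAGMA synchronous = OFF')    # don't fsync before returning
conn.execute('PRAGMA journal_mode = MEMORY')  # keep rollback journal in RAM
conn.execute('PRAGMA temp_store = MEMORY')    # keep temp tables in RAM

start = time.time()
for i in range(1, 10001):
    conn.execute('UPDATE files SET loc = ? WHERE id = ?', (i, i))
print(f'updated 10,000 rows in {time.time() - start:.2f}s')
```

With the pragmas left at their defaults, each autocommitted UPDATE waits for the journal and database writes to reach disk, which is where the ten-records-a-second pace comes from.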