m***@wormbase.org
2017-05-22 13:11:45 UTC
Hi,
We have an import job that takes approx. 7 hours to transact the EDN
generated from our legacy database. The resulting storage size is currently
~140GB, which shrinks to 16GB after a backup and restore.
The import process necessitated storing ancillary information in a custom
entity :import/temp.
Because this information is not needed after the import job has concluded,
we excise this data then attempt to reclaim storage at the end of the
process.
I'm experiencing an issue where datomic.api/gc-storage seems to never
return after performing the import job;
I'm seeing the datomic.garbage :garbage/collected event in the log when I'd
expect it (after roughly the same time the import job takes to run),
but the code we run to excise this temporary data (sync-index,
gc-storage) does not seem to return (I've left it running for > 3 days
before giving up waiting).
If someone could point out anything that I'm doing wrong with regard to the
strategy for performing gc-storage I'd be most grateful!
The code we run to do this can be seen here:
https://github.com/WormBase/pseudoace/blob/master/src/pseudoace/cli.clj#L319-L329
I wasn't sure of the "correct" approach, but stumbled upon:
https://hashrocket.com/blog/posts/bulk-imports-with-datomic
which is where the code was taken (verbatim!) from (seemed to make sense to
me!)
Many thanks,
Matt
--
You received this message because you are subscribed to the Google Groups "Datomic" group.
To unsubscribe from this group and stop receiving emails from it, send an email to datomic+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.