[CentOS] time foo

hw hw at gc-24.de
Sat Dec 2 12:42:08 UTC 2017


John R Pierce wrote:
> On 12/1/2017 11:32 AM, hw wrote:
>> So this would mean that the database (running on a different server) takes
>> almost two times as much as foo --- which I would consider kinda excruciatingly
>> long because it's merely inserting rows into two different tables after they were
>> prepared by foo and then processes some queries to convert the data.
>>
>> The queries after importing may take like 3 or 5 minutes.  About 4.5 million rows
>> are being imported.
>
> so you're missing about 25 minutes, and maybe 5 minutes is spent post processing, so that's 20 minutes spent in the data insertion?

Yes, and the 15 minutes actually spent in foo go to converting
the fields and sending them to the server, which I think is pretty
good.

> inserting one row at a time?  or in batches?    remember a database server is going to do commits after each transaction, which forces the data to be flushed to disk.   4.5 million separate row transactions, yeah, I could see that taking some time, plus add that many network round trips, etc.   if the db server just has a single SATA disk, you're doing 9 million committed writes combined to the two tables?    20 minutes for 9 million inserts, that's 7500 per second.

They are inserted one row at a time, during one transaction
for each of the CSV files.  I'd have to figure out how to
insert them in batches, that might yet be faster.  I could
easily stack up 1000 rows or so and then insert them all at
once, if that's possible.
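
A minimal sketch of what such batching could look like, assuming the
server accepts multi-row INSERT statements (INSERT ... VALUES (...), (...), ...).
It is written in Python against an in-memory SQLite database purely as a
stand-in; the table "items", its two columns and the helper names are
made up for illustration and are not what foo actually uses. Each batch
becomes one statement, so a network client would need one round trip
per batch instead of one per row, still inside a single transaction.

```python
import sqlite3
from itertools import islice


def chunks(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def insert_in_batches(conn, rows, batch_size=1000):
    """Insert (name, value) rows using one multi-row INSERT per batch.

    All batches run inside a single transaction.
    Returns the number of INSERT statements that were sent.
    """
    statements = 0
    with conn:  # commits once at the end, rolls back on error
        for batch in chunks(rows, batch_size):
            # One "(?, ?)" group per row in this batch.
            placeholders = ", ".join(["(?, ?)"] * len(batch))
            sql = "INSERT INTO items (name, value) VALUES " + placeholders
            # Flatten [(a, b), (c, d)] into [a, b, c, d] for the placeholders.
            flat = [field for row in batch for field in row]
            conn.execute(sql, flat)
            statements += 1
    return statements


# Hypothetical demo data: 250 rows, sent 100 at a time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, value INTEGER)")
rows = [("row%d" % i, i) for i in range(250)]
sent = insert_in_batches(conn, rows, batch_size=100)
count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(sent, count)
```

The demo uses a batch size of 100 because older SQLite builds limit the
number of bound parameters per statement; other servers have their own
limits on statement or packet size, so the 1000 rows mentioned above
would need checking against whatever the real database allows.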
