For When You Can't Have The Real Thing

2009-04-23 #1

Created by dave. Last edited by dave, 14 years and 312 days ago.

Inserts got slow

So I made two mistakes:

First, I turned on indexing around the 65-million-record mark. Building the indexes took 45 minutes, during which inserts stopped completely. Once the indexes existed, every insert took much longer because commits were waiting on I/O, most likely from index updates.
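The usual fix is to load with indexes dropped and build them once at the end; a single sequential index build is far cheaper than millions of incremental updates. A sketch in SQL (the `flows` table, `flows_start_time_idx` index, and file path are hypothetical names, not the actual schema):

```sql
-- Drop the index before the bulk load (assumed index name).
DROP INDEX IF EXISTS flows_start_time_idx;

-- Bulk-load the data; COPY is much faster than row-by-row INSERTs.
COPY flows FROM '/data/flows.csv' CSV;

-- Rebuild the index once, after all rows are in.
CREATE INDEX flows_start_time_idx ON flows (start_time);
```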

Second, I didn't tune Postgres. This is advice I got second-hand for a mail/Postgres server with 2GB of RAM:

  1. Set effective_cache_size to about 1GB.
  2. Increase shared_buffers from 48MB to 96MB.
  3. Set checkpoint_segments to 16.
  4. Set wal_buffers to 512KB.
  5. Set default_statistics_target to 50.

So I made those changes:

effective_cache_size = 131072           # typically 8KB each 131072x8KB=1GB
shared_buffers = 40000                  # min 16 or max_connections*2, 8KB each
checkpoint_segments = 16                # in logfile segments, min 1, 16MB each
wal_buffers = 64                        # min 4, 8KB each
default_statistics_target = 50          # range 1-1000
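Most of these settings are counts of 8KB pages, which makes the comments easy to misread. A quick sanity check of the conversions (my own arithmetic, not from the Postgres docs):

```python
PAGE_KB = 8  # Postgres block size in this build: 8 KB per page

def pages_to_mb(pages):
    """Convert a page-count setting to megabytes."""
    return pages * PAGE_KB / 1024

print(pages_to_mb(131072))  # effective_cache_size -> 1024.0 MB (1 GB)
print(pages_to_mb(40000))   # shared_buffers       -> 312.5 MB
print(pages_to_mb(64))      # wal_buffers          -> 0.5 MB (512 KB)
print(16 * 16)              # checkpoint_segments: 16 segments x 16 MB = 256 MB of WAL
```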

Rerunning the import now; we'll see what happens.

Later: More Changes

  • changed the running loop so we insert flow files in chronological order instead of at random
  • changed the running loop so that we read 12 flow files at a time
  • changed the internal logic so we only commit once per run (i.e. once per 12 flow files read)

Put those together, and right now I am on day 5 of 25, inserting one hour of data (337,000 flows) every 3.5 minutes.
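The loop changes above can be sketched in Python (names like `parse_flow_file` and `INSERT_SQL` are assumptions standing in for the real code, not what actually runs here):

```python
BATCH_SIZE = 12  # flow files read per run, i.e. per commit

def batches(paths, size=BATCH_SIZE):
    """Yield file paths in chronological order, `size` at a time.

    Assumes flow-file names sort lexically by timestamp.
    """
    ordered = sorted(paths)
    for i in range(0, len(ordered), size):
        yield ordered[i:i + size]

def run_import(conn, paths):
    """Hypothetical driver: one commit per batch instead of per insert."""
    for batch in batches(paths):
        cur = conn.cursor()
        for path in batch:
            for flow in parse_flow_file(path):  # parse_flow_file: assumed helper
                cur.execute(INSERT_SQL, flow)   # INSERT_SQL: assumed statement
        conn.commit()  # single commit per 12 files
```

Batching the commits matters because each commit forces a WAL flush to disk; one fsync per 12 files is far fewer waits than one per flow.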
