> > > On Tue, Jul 29, 2014 at 1:25 AM, John Doe <jdmls at yahoo.com> wrote:
> > > > From: Benjamin Smith <lists at benjamindsmith.com>
> > > >
> > > > Thanks for your feedback - it's advice I would have given myself just a
> > > > few years ago. We have *literally* in the range of one hundred million
> > > > small PDF documents. The simple command
> > > >
> > > >     find /path/to/data > /dev/null
> > > >
> > > > takes between 1 and 2 days, system load depending. We had to give up on
> > > > rsync for backups in this context a while ago - we just couldn't get a
> > > > "daily" backup more often than about 2x per week.
> > >
> > > What about:
> > > 1. Set up inotify (no idea how it would behave with your millions of files)
> > > 2. One big rsync
> > > 3. Bring it down and copy the few modified files reported by inotify.
> > >
> > > Or lsyncd?
>
> On Tue, Jul 29, 2014 at 12:02 PM, Cliff Pratt <enkiduonthenet at gmail.com> wrote:
>
>> rsync breaks silently or sometimes noisily on big directory/file
>> structures. It depends on how the OP's files are distributed. We organised
>> our files in a client/year/month/day hierarchy and run a number of rsyncs
>> on separate parts of the hierarchy. Older stuff doesn't need to be rsynced
>> but gets backed up every so often.
>>
>> But it depends whether or not the OP's data is arranged so that he could
>> do something like that.
>>
>> Cheers,
>>
>> Cliff
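
John Doe's three-step suggestion could look roughly like the shell sketch
below, using inotifywait from inotify-tools. The paths, backup host, and log
file are placeholders, and as he notes himself, recursive watches on a tree
of ~100M files may blow past fs.inotify.max_user_watches - treat this as an
illustration, not a tested recipe:

    #!/bin/bash
    # Step 1: record files that change while the big rsync is running.
    # NOTE: -r puts a watch on every directory; on a tree this size that
    # requires a very large fs.inotify.max_user_watches and takes a while.
    inotifywait -m -r -e close_write,create,moved_to --format '%w%f' \
        /path/to/data > /tmp/changed.log &
    watch_pid=$!

    # Step 2: one big rsync of the whole tree.
    rsync -a /path/to/data/ backuphost:/backup/data/

    # Step 3: stop watching and re-copy only the files inotify reported.
    # (Deletions are not handled here and would need separate bookkeeping.)
    kill "$watch_pid"
    sort -u /tmp/changed.log | sed 's|^/path/to/data/||' | \
        rsync -a --files-from=- /path/to/data/ backuphost:/backup/data/

lsyncd packages essentially this same inotify-feeds-rsync loop as a daemon,
which avoids hand-rolling the bookkeeping, but it is subject to the same
watch-count constraints.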
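
Cliff's split-the-hierarchy approach might look something like the
following, assuming the client/year/month/day layout he describes (the
paths and host are again hypothetical). Only the current year is synced
daily; older subtrees get backed up on their own, slower schedule:

    #!/bin/bash
    # Hypothetical layout: /path/to/data/<client>/<year>/<month>/<day>/
    year=$(date +%Y)
    cd /path/to/data || exit 1

    # One rsync per client, current year only. -R (--relative) preserves
    # the client/year path on the destination.
    for client in */; do
        [ -d "${client}${year}" ] || continue
        rsync -aR "${client}${year}/" backuphost:/backup/data/ &
    done
    wait

With many clients you would want to cap the concurrency (e.g. batches via
xargs -P) rather than forking everything at once, but the point stands:
each rsync only walks one small subtree, so a hang or failure is contained
and a retry is cheap compared to re-walking the whole tree.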