[CentOS] OT? File order on CentOS/Samba server -- SOLVED (kind of...)

Fri Jan 23 20:57:54 UTC 2009
Miguel Medalha <miguelmedalha at sapo.pt>

> Did you consider sharing a directory from the machine running distiller 
> and cifs-mounting it on the linux side to get ntfs behavior?
That is out of the question. The Windows machines are graphics 
workstations which are not all connected all the time, and the Distiller 
service is essential to the network.

> Also, I'm curious about the timing of the runs.  It doesn't sound like the file 
> operations are grouped atomically.  How do you ensure that the whole set 
> is present when distiller starts, or that only one set is present?
This is a very peculiar implementation. As I said in my first post, we 
are a newspaper and, like all newspapers, we don't have a fixed time to 
close the edition. It closes when it is ready, that's all.

The PDFs for print are automatically produced one by one from PostScript 
files. The PS files land in a folder watched by Acrobat Distiller and, 
once a file has been stable for more than 10 seconds, the conversion 
begins. Each file contains only one page, which will then be joined to 
others to form a plan for a platesetter.
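
For readers unfamiliar with watched folders, the "stable for more than 10 seconds" rule can be sketched roughly as below. This is only an illustration in Python, not how Distiller is implemented: the function names are hypothetical, and using the modification time as the stability test is an assumption.

```python
import os
import time

# 10 seconds comes from the description above; everything else is assumed.
STABLE_SECONDS = 10


def is_stable(path, now=None, min_age=STABLE_SECONDS):
    """Return True if the file was last modified more than min_age seconds ago."""
    if now is None:
        now = time.time()
    return (now - os.stat(path).st_mtime) > min_age


def stable_files(folder, now=None, min_age=STABLE_SECONDS):
    """Names of regular files in folder left unchanged for more than min_age seconds."""
    result = []
    for name in sorted(os.listdir(folder)):
        full = os.path.join(folder, name)
        if os.path.isfile(full) and is_stable(full, now, min_age):
            result.append(name)
    return result
```

A file still being written keeps getting a fresh modification time, so it is skipped until the writer has finished and the quiet period has passed.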

When all the pages have been produced, one of the graphics people places 
a special text file in a folder watched by Distiller, which then begins 
to bulk process all the individual PS files: downsampling images, 
converting the color space to sRGB, consolidating font subsets, creating 
bookmarks and indexes, etc. The result is a multipage PDF for electronic 
distribution, containing the whole newspaper in the sRGB color space.

This always worked flawlessly until, some days ago, I replaced the win2k 
server with a new CentOS/Samba one. Everything worked better and faster 
except... the pages in this last PDF were in what seemed like a 
random order. Ordering them by hand is a time-consuming and error-prone 
process, especially when everybody is already tired... Producing a 
newspaper is pretty tense work, you know.
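
To illustrate what "random order" means here: a program that enumerates a directory receives the entries in whatever order the server hands them over, unless it sorts them itself. The Python sketch below shows the kind of name-based ordering one would expect for pages. The file names are hypothetical, and it does not change what Distiller does internally.

```python
import re


def natural_key(name):
    """Split a name into text and number chunks so 'page2' sorts before 'page10'."""
    parts = re.split(r"(\d+)", name)
    # With a capturing group, odd positions hold the digit runs.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def in_page_order(names):
    """Return the names sorted so that embedded page numbers compare numerically."""
    return sorted(names, key=natural_key)


# Hypothetical listing, in the arbitrary order a directory scan might return it.
listing = ["page10.ps", "page2.ps", "page1.ps"]
ordered = in_page_order(listing)
```

A plain alphabetical sort would put page10.ps before page2.ps, which is why the key compares the numeric parts as integers.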

The difficulty with the scripted solutions proposed here is that we 
cannot know in advance at what time this process will take place or 
how many pages will be involved. At the end of each issue 
every minute counts. A watching process would have to poll the status of 
the workflow for several hours at very short intervals, which would be 
a waste of processor cycles. And not a very elegant thing to do, I feel.


I am (for now...) convinced that the tip given to me here about 
dir_index and the use of fsck -fD will solve this problem.
On Monday I will know. It will be a loooong wait for me.
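
For anyone who wants to check whether dir_index is currently enabled on a filesystem: `tune2fs -l` prints a "Filesystem features:" line listing the active features. The sketch below, in Python, only parses text that has already been captured from that command. The function names are hypothetical and the sample text is illustrative, not taken from the server discussed here.

```python
def filesystem_features(tune2fs_output):
    """Return the feature names from the 'Filesystem features:' line, or []."""
    for line in tune2fs_output.splitlines():
        if line.startswith("Filesystem features:"):
            return line.split(":", 1)[1].split()
    return []


def has_dir_index(tune2fs_output):
    """True if dir_index appears among the listed filesystem features."""
    return "dir_index" in filesystem_features(tune2fs_output)


# Illustrative sample only; real output contains many more lines.
sample = (
    "Filesystem volume name:   <none>\n"
    "Filesystem features:      has_journal ext_attr dir_index filetype\n"
)
```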

Thank you again.