We have been fortunate to hang onto one of our summer interns for part-time work on weekends during the current school year. One of the intern's jobs is to load documents and data, which are then processed. The documents are .txt, .docx, and .pdf files. The data files are raw sensor outputs, usually captured using ADCs, mostly with eight-bit precision. All files are loaded or moved from one machine to another with sftp.
The intern noticed right away that the documents transfer perfectly from our PPC and SPARC machines to our Intel/CentOS platforms. The raw data files, not so much. There is always an endian (thanks, Gulliver) issue, which we assume is due to the bytes of data being formatted into 32-bit words somewhere in the big-endian systems. It is not totally clear why the document files do not have this issue. If there is a known principle behind these observations, we would very much appreciate any information that can be shared.
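To make the assumption concrete, here is a minimal Python sketch of what we think may be happening. It is not our actual acquisition code: the sample values, the packing of four samples into one 32-bit word, and the helper name `swap32` are all hypothetical and chosen only for illustration.

```python
import struct

# Hypothetical example: four eight-bit ADC samples in capture order.
samples = bytes([0x01, 0x02, 0x03, 0x04])

# Assumption: the big-endian system packs the samples into one 32-bit word
# (first sample in the most significant byte) and writes that word to disk.
word = int.from_bytes(samples, "big")
file_bytes = struct.pack(">I", word)  # big-endian layout of the word

# Reading the file one byte at a time gives the same values on any machine.
bytewise = list(file_bytes)

# Interpreting the same bytes as a little-endian 32-bit word, as an Intel
# machine would natively, yields a different word value.
(word_le,) = struct.unpack("<I", file_bytes)


def swap32(data: bytes) -> bytes:
    """Reverse the byte order inside each 32-bit word of data."""
    if len(data) % 4 != 0:
        raise ValueError("length must be a multiple of 4 bytes")
    return b"".join(data[i:i + 4][::-1] for i in range(0, len(data), 4))


# After swapping, the little-endian read recovers the original word value.
(word_fixed,) = struct.unpack("<I", swap32(file_bytes))

print(hex(word), hex(word_le), hex(word_fixed))
```

If this sketch matches reality, the bytes in the file are the same on both machines after the sftp transfer, and the difference only appears when a program groups them into multi-byte words using its native byte order.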