On 12/14/19 7:28 PM, H wrote:
> I have pdftotext 0.26.5, the current version for CentOS 7 and the Mate desktop as far as I can ascertain. The page https://www.xpdfreader.com/pdftotext-man.html seems to suggest that the latest version is 4.02 which seems a gigantic leap ahead.
>
> Since I have a Chinese text PDF which I am unable to extract any text from using pdftotext, instead I end up with a collection of garbage Latin characters, I am curious how to get a later version? Copying and pasting from Atril 1.16.1 (seems to be part of the Mate desktop I am running) also makes me end up with garbage... Not surprising since it also seems to use pdftotext 0.26.5...
>
> Any suggestions? Later version of pdftotext? If so, wherefrom? Another PDF-viewer?

pdftotext is distributed as part of the poppler package, which as you suggest is at 0.26.5. However, the latest version of poppler is 0.83.0. And the man page for pdftotext on EL7 suggests it is at version 3.03, which is not quite so dramatic a difference.

In any case, welcome to the joys of running an enterprise distribution. You'll find newer versions in EL8 or Fedora. It's an integral core component of the system so generally not updated lightly.

-- 
Orion Poplawski
Manager of NWRA Technical Systems          720-772-5637
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       orion at nwra.com
Boulder, CO 80301                 https://www.nwra.com/