Arthur Sale writes
> From thence it finds the pdf files and indexes them as well.
That holds only if such PDF files are indexable. From correspondence,
I understand that Google use pdftohtml, see
http://pdftohtml.sourceforge.net/
to extract text from PDF files. Archive managers should
check that pdftohtml does a reasonable job on their PDF
files. Fortunately, this is easy to test because pdftohtml is
open source software, just as Eprints is, so anyone can
download it and run it over their own archive.
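
For archive managers who want to try this, here is a rough sketch of
such a check in Python. It is only an illustration, not a description
of how Google actually runs the tool. It assumes pdftohtml is
installed and on the PATH, and that the -i, -noframes and -stdout
options behave as described in pdftohtml's own usage text. The
MIN_WORDS threshold and all function names are made up for this
example.

```python
import re
import subprocess
import sys
from html.parser import HTMLParser

# Hypothetical threshold: fewer extracted words than this is treated
# as a sign that the PDF may be a scanned image or otherwise opaque.
MIN_WORDS = 50

# Elements whose content is not part of the document's body text.
SKIPPED_TAGS = {"title", "style", "script"}


class TextCollector(HTMLParser):
    """Collect character data from HTML, ignoring tags and SKIPPED_TAGS."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Return the body text of an HTML string with whitespace collapsed."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    return " ".join(" ".join(collector.chunks).split())


def count_words(html):
    """Count runs of two or more letters in the extracted text."""
    return len(re.findall(r"[^\W\d_]{2,}", html_to_text(html)))


def extract_html(pdf_path):
    """Run pdftohtml on one file and return its HTML output as a string.

    Assumes a pdftohtml binary on the PATH that accepts these options.
    """
    result = subprocess.run(
        ["pdftohtml", "-i", "-noframes", "-stdout", pdf_path],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Report a word count per PDF and flag files that yield little text.
    for path in sys.argv[1:]:
        words = count_words(extract_html(path))
        flag = "" if words >= MIN_WORDS else "  <-- little or no text"
        print(f"{path}: {words} words{flag}")
```

Saved under a name of your choosing and run with a list of PDF files
as arguments, it prints one line per file. A file flagged as having
little or no text is worth opening by hand, since a low count is only
a hint and not proof that the file cannot be indexed.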
Cheers,
Thomas Krichel mailto:krichel_at_openlib.org
http://openlib.org/home/krichel
RePEc:per:1965-06-05:thomas_krichel
Received on Wed Dec 15 2004 - 15:43:50 GMT