HEATH Digest - 12 Jan 2001 to 13 Jan 2001 (#2001-14)

Mike Morris morris at COGENT.NET
Sun Jan 14 23:27:58 EST 2001


At 05:37 PM 01/14/2001, you wrote:
> > Mike Morris wrote:
> > > produce the files to Wayne's specifications if somebody else
> > > has a scanner and can produce TIFF files (normal or
> > > compressed) of each page.  The program that comes with
> > > HP Scanjets can do this quite well.
>
>From my (limited, admittedly) experience with this, Acrobat files are amazingly
>compact while still preserving the original format of the document. TIFF files
>are monstrously large in comparison, and I would find them less desirable
>myself. Also, there is no capability of indexing in TIFF (it is an image
>format, after all) in contrast to .PDF files, which (I think) allow the
>creation of an index. Another excuse for me to learn Acrobat in more detail,
>I suppose.

I guess I did not make myself plain.
The TIFF is an image file that I would use as an input to the OCR system to
produce the .txt file and from that a PDF file.  Once I was done I'd delete
the TIFF.  I never suggested distributing the TIFFs anywhere except from the
person who has the autofeed scanner to me.

>Another person was mentioning flat files - that is a lot more work to do,
>and while scanning 60+ pages in via Acrobat would require some time
>investment, the probability of my having enough time to type in 60+ pages of
>parts is... zero.

Which is why the process I was suggesting is to take the original document,
scan it to TIFFs and get them to me via Jaz, Zip, CD-R or FTP.  I could OCR
them and produce the PDF file for distribution.

>The other option is to scan the information in via OCR. Unfortunately,
>OCR is only about 98% accurate - and depending on the format we scan to,
>we still will need to deal with the font and formatting issues that .PDF deals
>with seamlessly.
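
For a rough feel of what 98% means over a document this size, here is a back-of-the-envelope sketch in Python. The 60 pages and the 98% come from the numbers quoted above; the 2000 characters per page is purely my guess for a line-printer parts list, and the sketch assumes the 98% is a per-character figure.

```python
def expected_ocr_errors(pages, chars_per_page, accuracy):
    """Estimate how many characters OCR would get wrong across a document."""
    total_chars = pages * chars_per_page
    return round(total_chars * (1.0 - accuracy))


# 60 pages and 98% are from the thread; 2000 chars/page is an assumed figure.
errors = expected_ocr_errors(pages=60, chars_per_page=2000, accuracy=0.98)
print(f"about {errors} suspect characters, roughly {errors // 60} per page")
```

If the real page density is different, only the one assumed number needs changing.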

It's only one font, and if the original document is anything like the one I saw
at the Anaheim Calif and Los Angeles Calif Heath stores over 25 years ago it's
a mainframe line printer font.
The OCR package I use inserts a ~ character in place of anything it has a
problem with, and all I have to do to see if there were problems is open the
output file with Notepad and scan for the ~ character.
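
The same check can be done without eyeballing the file. What follows is a minimal sketch in Python, not part of my OCR package: the function name `find_ocr_doubts` and the sample text are made up for illustration, and the only thing taken from the real process is that a ~ marks anything the OCR had trouble with.

```python
def find_ocr_doubts(text, marker="~"):
    """Return (line_number, column, line) for every marker in the OCR output."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        col = line.find(marker)
        while col != -1:
            # Report 1-based columns so they match what a text editor shows.
            hits.append((line_no, col + 1, line))
            col = line.find(marker, col + 1)
    return hits


# Hypothetical OCR output; the part numbers are invented for this example.
sample = (
    "R101  1~0 ohm resistor\n"
    "C202  .01 uF disc capacitor\n"
    "Q3~3  transistor ~N3904\n"
)

for line_no, col, line in find_ocr_doubts(sample):
    print(f"line {line_no}, col {col}: {line}")
```

One caveat: if the original document itself contains a ~ anywhere, it will show up as a false hit, so each flagged line still needs a look against the paper copy.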

>Anyway, I have dropped a line to John Parker regarding this... perhaps the first
>thing to do is to get a copy of a couple of pages of the document in question
>and perform a test of Acrobatization (yes, I just made that word up :^> )
>and see what people think of the results.

If you have a way to take an original and go to Acrobat directly, go for it.

>73, Dave N4DJS

Mike WA6ILQ



Listserver Subscription: listserv at listserv.tempe.gov - "subscribe heath 'name' 'call'"
Listserver Submissions:  heath at listserv.tempe.gov
Listserver Unsubscribe:  listserv at listserv.tempe.gov - "signoff heath"



