Petrophysics Discussion Forum - PETROPHYSICS & INTERPRETATIONS FORUM
petrophysics forum



administrator
Registered: 24.05.05
Rating: 4
Posted: 28.03.09 08:16. Subject: Excerpt from the Gigapedia community


Hello Gigapedia-Community,

even before I signed up for Gigapedia, I had already collected many ebooks. Now I've got more than 150 GB of scanned ebooks
saved as PDF, mostly OCR'ed.

I've sorted them as well as possible into subfolders such as biology, chemistry, social... but it's getting more and more
difficult to find any book.

Do you use indexing software like Google Desktop Search, Spotlight on OS X, Windows Search or any other similar program?

At the moment Spotlight is indexing my whole library, but after 5 h it still says that it will take a further 100 hours. I'm
starting to believe that desktop search programs are not that efficient for indexing and searching a PDF library.

What is your experience?

bv5 2009-03-24 12:43
Take a look at the ISBN tool from alif98 @ http://gigapedia.com/threads/628

It will help you name your books in a consistent manner. It will not help you search.

regards
bv

spl1nt4 2009-03-24 13:53
Hi bv5,

I'm in the same position as you, with about the same size library. I have tried a number of indexing tools (most of them),
but until recently could not find one that indexed CHM files properly.

I have been experimenting with the trial version of Archivarius 3000, available at
hxxp://www.likasoft.com/index.shtml

It has so far passed all of the tests I have thrown at it, and I may purchase it in the next week or so.

Does anyone else here have any experience with this particular package?

Spl1nt4


pdfetc 2009-03-24 18:06
I have tried many and keep coming back to Google Desktop + ScanSoft OmniPage Search Indexer. Once you add this plugin,
it automatically OCRs your files and lets you search them better. Google Desktop's search interface is not that great, but
you can add the gdSuite plugin to help with that.

Really, the OmniPage plugin (which is free, of course) is miraculous.

If only I could figure out how to make Google Desktop scan and search only one folder (with the PDFs) and not the whole
freaking desktop, it would be great. You can exclude folders, but as far as I can tell you can't direct it to index only
one folder.

montana 2009-03-24 18:48
Now Spotlight, the integrated OS X search engine, says that it has approx. 2000 h remaining :S

Unbelievable. The index file has already reached 2 GB.

I think the remaining time is calculated wrongly; let's see. I'll report more details when it's finished, see you next
week :D

By the way, the search engine also indexes the content, not just the file names and their tags.

montana 2009-03-24 19:04
pdfetc: maybe I've got a solution for you.

I do it like this: I store my ebooks on a second disk. That second disk is an external drive; because indexing was slow, I
opened the case and plugged the drive directly into the mainboard.

I enabled indexing only on the second drive and excluded the whole home folder (c:/). I also made sure that the index file
is written directly to the second HD, where all the PDFs are.

Once it has finished indexing, I'll put the HD back in the external case.

Every time I plug in the external HD, Spotlight (or, in your case, Google Desktop Search) will use the index file directly
on the disk.

So you should exclude every disk except the one containing the books. That should work. Please let me know.

Sincerely
Montana
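
A rough illustration of the idea in the post above (index only the book disk, and keep the index file on that same disk), as a minimal Python sketch. This is not how Spotlight or Google Desktop work internally. The function names `index_folder` and `find` and the file name `filename_index.json` are made up for this example, and the sketch only indexes file names, not file content.

```python
import json
import os

def index_folder(root, index_name="filename_index.json"):
    """Walk only `root` and save the relative paths of its files inside `root`.

    Hypothetical helper: the index is a plain JSON list stored next to the books.
    """
    index_path = os.path.join(root, index_name)
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            if full == index_path:
                continue  # do not list the index file itself
            entries.append(os.path.relpath(full, root))
    entries.sort()
    with open(index_path, "w", encoding="utf-8") as f:
        json.dump(entries, f)
    return index_path

def find(root, term, index_name="filename_index.json"):
    """Return indexed relative paths whose name contains `term` (case-insensitive)."""
    with open(os.path.join(root, index_name), encoding="utf-8") as f:
        entries = json.load(f)
    return [entry for entry in entries if term.lower() in entry.lower()]

# Usage (the path is a placeholder for wherever the book disk is mounted):
# index_folder("/Volumes/Books")
# print(find("/Volumes/Books", "biology"))
```

Because the stored paths are relative to the library root, the same index file stays usable when the drive is later mounted at a different location, which is the point of keeping it on the external disk.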

exib 2009-03-25 05:47
Were you aware that GDS indexes only the first 10,000 words of any given file?

Go to http://desktop.google.com/support/bin/answer.py?hl=en&answer=17208
Click on "Try searching for a word that appears at the beginning of the PDF."

Copernic apparently allows you to set number of words to an arbitrary number. (I haven't tried this, but I emailed them,
and that's what they told me.)
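
To make the effect of such a per-file word cap concrete, here is a minimal Python sketch of a toy inverted index. It is not Google Desktop's or Copernic's actual code; the names `build_index`, `search` and `max_words` are made up for illustration, and the "book" is a plain string rather than a PDF.

```python
import re
from collections import defaultdict

def build_index(docs, max_words=None):
    """Map each word to the set of document names that contain it.

    docs: dict of {name: text}. max_words is a hypothetical per-file cap;
    None means the whole text is indexed.
    """
    index = defaultdict(set)
    for name, text in docs.items():
        words = re.findall(r"\w+", text.lower())
        if max_words is not None:
            words = words[:max_words]  # everything past the cap is never indexed
        for word in words:
            index[word].add(name)
    return index

def search(index, word):
    """Return the sorted names of documents indexed under `word`."""
    return sorted(index.get(word.lower(), set()))

# A "book" whose only mention of "porosity" comes after 12,000 filler words.
docs = {"book.txt": "filler " * 12000 + "porosity"}

capped = build_index(docs, max_words=10000)
full = build_index(docs)

print(search(capped, "porosity"))  # [] : the word lies beyond the cap
print(search(full, "porosity"))    # ['book.txt']
```

With a cap in place, a search term that first appears late in a long book is simply absent from the index, so the file looks like a non-match even though it contains the term.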

seeker1 2009-03-25 07:06
I have thousands of research articles, ebooks, PPTs etc. and have been using the Copernic software, which indexes almost
everything. Try it; I think it will solve your problem.
Regards

rinker 2009-03-25 09:07
Consider getting an account at Google Books, which will allow you to take advantage of the fact that Google has already
digitized many books. With your account, add those books that are in your collection. I believe you will be able to search
for text just amongst your book collection - I haven't tried that though. Google has not scanned all books, so before you
commit time to this, verify that you can search just within your own library at Google, and do a pilot study by taking
about 25 of your books at random, search for a part of a sentence, and see if an acceptable percentage have been scanned by
Google.
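
The pilot study suggested above can be organised with a few lines of Python. This is only a sketch: it picks the random sample and computes the percentage, while the actual searching at Google still has to be done by hand. The function names, the `ebooks` folder and the example results are invented for illustration.

```python
import random
from pathlib import Path

def pick_sample(library_dir, sample_size=25, seed=None):
    """Pick up to `sample_size` PDF files at random from `library_dir`, recursively."""
    pdfs = sorted(Path(library_dir).rglob("*.pdf"))
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    return rng.sample(pdfs, min(sample_size, len(pdfs)))

def coverage_percent(results):
    """results: dict of {title: True/False}, filled in by hand after each search."""
    if not results:
        return 0.0
    return 100.0 * sum(results.values()) / len(results)

# Usage (the folder name is a placeholder):
# for path in pick_sample("ebooks", seed=1):
#     print(path.name)

# Made-up outcome for four books, three of which were found:
print(coverage_percent({"A": True, "B": False, "C": True, "D": True}))  # 75.0
```

Whether the resulting percentage is "acceptable" is a personal threshold; the sketch only removes the temptation to hand-pick the books that are most likely to be found.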

terminator 2009-03-26 02:36
(Sorry for interrupting this discussion. I just want to warn users to be careful about what they post, otherwise we will
close discussions to newbies! Staff cannot spend half a day editing and deleting your posts. Thanks.)


tutut 2009-03-26 12:44
I am trying "Archivarius", mentioned by spl1nt4; it seems very good.


oldroad 2009-03-27 11:09
I am using Avafind: high performance and easy to use, but it has not been developed further since 2004.
***snip***
I am also using Google Desktop Search. Here is my experience:
- Avafind:
Advantage: faster, but it only searches for files and folders (the Pro version can also index network devices)
Disadvantage: can't index content
- Google Desktop:
Advantage: can index document content; good for searching files and folders
Disadvantage: takes a long time to index
Hope it helps ^^

***staff edit***
Please don't post links here; post them on gigle.ws instead and then paste the gigle link here! Thanks


mr-x 2009-03-27 17:30
With a pumped-up quad core, I find the default search in Windows Vista to be incredibly fast... almost instantaneous;
however, I don't know how many pages it searches within each document, and I imagine it would rapidly slow down on an
older computer.

The new search within Vista is my favourite part of that operating system by far.

I used Copernic for years, as I love the preview ability and the fact that searches were very rapid even on old
computers. The drawback, of course, was that at the time it couldn't easily handle CHM files, but I think there may be
plugins for that by now. Another drawback was that the index would get very large even with a modest collection of books.

nahidh-ebooks 2009-03-27 18:31
I am in a position similar to yours. If you are ready to convert all of your ebooks to PDF format, then Adobe Acrobat
Professional is suitable for what you want, since you could then use its Catalog feature. However, you would have
to be ready to do the following: (1) pay for the Professional version of Acrobat; (2) spare about double the disk
space you already use, since you will want to keep the original formats untouched; and (3), the least costly one, pay
the indexing time cost once, before using the Catalog feature. Good luck,
and I would like to hear about the result if you adopt the plan.

http://gigapedia.com/threads/1272?page=2


With respect and hope for understanding