All you have to do is download this Engines.txt file and place it in the fixed location C:\WordsEx\Engines.txt, and it will override WordsEx.exe's built-in list. You will want to pay attention to which engines return garbage or irrelevant results, answer in foreign languages, or take a long time, and manually edit them out of the list.

=============================================================
This is what I have done, as of June 27, 2008:
- 1. Rehabilitated quite a few old engine URLs.
- 2. Queried and analyzed some new engine URLs. (Way more yet to try.)
- 3. Studied a search of "Yeats Second Coming".
- 4. Commented out various "bad" search engines.
- 5. Gave the best search engines lower ordinals.
- 6. Commented out slow/weak/foreign/etc. engines.
That should give sparkling clean and good results.
=============================================================

; Gigablast.com found 75 pages that no other engine found. Good Relevance.
; However, that 75 includes 31 from Gigablast categories, only 4 relevant.
; Adding this bad url rule before the good url rule should have rid those.
; Even so, for now, Gigablast is my top performer. Low ordinal goes first.
GET 100 http://www.gigablast.com/search?n=100&q=
bad url has anchor DOMAIN "dir.gigablast.com"
good url has next TAG span ATTR clear VALUE "result-link"
more url has anchor TWO TOKENS "Next" NUMBER

; About.com found 56 pages that no other engine found. Good Relevance.
GET 110 http://search.about.com/fullsearch.htm?TopNode=/&terms=
good url has prior TAG div ATTR clear VALUE "res"
more url has anchortext NUMBER

; Entireweb.com found 50 pages that no other engine found. Good Relevance.
GET 120 http://www.entireweb.com/query?q=
good url has anchortext "Details"
more url has anchortext NUMBER

; Ask.com found 48 pages no other engine found. But 19 were from Ask cache.
; I'll need to tune up these parsing rules to avoid those cached ones.
; Even so, until then, this engine produces very good results.
GET 130 http://www.ask.com/main/AskJeeves.asp?ask=
good url has next TAG div ATTR onsrc
more url has anchortext NUMBER

; BBC.co.uk found 35 pages no other engine found, 37 within own web site.
; However, their BINARY OR dilutes the results for multi-word searches.
; In general, I commented out all BINARY OR search engines. You decide.
; GET 140 http://www.bbc.co.uk/cgi-bin/search/results.pl?go=homepage&scope=all&tab=all&Search=&q=
; good url has next TAG /h3
; more url has anchortext NUMBER

; Findarticles.com found 34 pages no other engine found, in own site, but good.
GET 150 http://www.findarticles.com/p/search?qt=
good url has next TAG cite
more url has anchortext NUMBER

; Lycos.de found 33 pages no other engine found, but all were in German.
; So I will give it a low ordinal, but comment it out. You may re-add it.
; GET 160 http://suche.lycos.de/cgi-bin/pursuit?enc=utf-8&query=
; good url has anchor TAG a ATTR clear VALUE "result"
; more url has anchortext NUMBER

; Slashdot.org found 30 pages no other engine found, only in own site.
; However, their BINARY OR dilutes the results for multi-word searches.
; In general, I commented out all BINARY OR search engines. You decide.
; GET 170 http://science.slashdot.org/search.pl?query=
; good url has prior TWO TOKENS "Score:" NUMBER_POINTS

; This Yahoo URL found 30 pages no other engine found, good relevance.
GET 180 http://search.yahoo.com/search?ei=UTF-8&fr=yfp-t-501&cop=mss&p=
none good until TWO TOKENS "WEB" "RESULTS"
good url has anchor TAG a ATTR clear VALUE "yschttl"
more url has anchortext NUMBER

; Bigblog.com found 23 pages no other engine found.
; However, their BINARY OR dilutes the results for multi-word searches.
; In general, I commented out all BINARY OR search engines. You decide.
; GET 190 http://www.bigblog.com/search.cgi?terms=
; good url has next TAG blockquote
; more url has anchortext NUMBER

; h-net found 20 pages no other engine found, in own msu.edu site, but good.
GET 200 http://www.h-net.org/multisearch.php?Submit=&searchtype=Discussion&searchquery=
good url has anchor PATH "/cgi-bin/logbrowse.pl"
more url has anchortext NUMBER

; clix.pt found 20 pages no other engine found, in English, and relevant.
; PT=Portugal sounds exotic, slow, foreign, but wait for its good stuff.
GET 210 http://pesquisa.clix.pt/resultado.html?t=Homepage&s=/index2.html&c=pesquisa&in=Mundial&ok=&question=
good url has prior TAG td ATTR clear VALUE "verdana11cinza"
more url has anchortext NUMBER

; News.google.com found 19 pages no other engine found.
; However, their BINARY OR dilutes the results for multi-word searches.
; In general, I commented out all BINARY OR search engines. You decide.
; GET 220 http://news.google.com/news?hl=en&lr=&sa=N&tab=in&q=
; good url has prior TAG td ATTR clear VALUE "j"
; more url has anchortext NUMBER

; Reference.com and Virgilio.it are next because both returned the same
; 18 pages in common that no other engine found; and both returned 6 more
; that no other engine found; But, since those 6 from Virgilio were in
; Italian, I will comment it out. But if you speak Italian, restore it.
GET 230 http://www.reference.com/search?db=web&q=
none good until TWO TOKENS "Search" "took"
good url has next TAG /td
more url has anchortext NUMBER
none good after anchortext NUMBER

; Again, those 6 in Italian were very relevant. Restore if you speak it:
; GET 240 http://search.virgilio.it/search/cgi/search.cgi?lr=&offset=0&hits=10&switch=0&f=hs&qs=
; good url has anchor TAG a ATTR clear VALUE "link16"
; more url has anchortext NUMBER

; Websquash.org found 17 pages no other engine found, good relevance.
GET 250 http://www.websquash.org/ssearch/smartsearch.cgi?DoSearch=&keywords=
good url has prior TAG li ATTR vlink
more url has anchortext "Next Results"

; Who would not expect Google to rank first? But their default form only
; returns 10 hits per query, so it is slow to add up. I could change that
; to 100 hits per query, but I will not; let them remain here, for google
; and clix.pt got 16 in common, and google another 10, found by no other.
GET 260 http://www.google.com/search?hl=en&ie=ISO-8859-1&btnG=Google+Search&q=
good url has anchor TAG a ATTR clear VALUE "l"
more url has anchortext NUMBER

; Teoma.com found 15 pages no other engine found, good relevance.
GET 270 http://s.teoma.com/search?submit=&qcat=1&qsrc=1&q=
more url has anchortext NUMBER
good url has prior TOKEN NUMBER+"k"

; The time to deal with altavista is now, for their 13 unique hit pages.
; But--Altavista.com manipulated Yeats into yeast to return sales spam;
; Also--Altavista.com returned "second coming" without "Yeats" results.
; So, altavista has no place in this list among the other good engines.
;n/g: GET 280 http://www.altavista.com/web/results?itag=wrx&kgs=0&kls=0&q=
;n/g: good url has anchor TAG a ATTR clear VALUE "res"
;n/g: more url has anchortext NUMBER

; Avoid overstock.com: A few hits, but far too much irrelevant sales spam.
;n/g: GET 290 http://www.overstock.com/cgi-bin/d2.cgi?page=search&keyword=
;n/g: more url has anchortext NUMBER
;n/g: good url has next TWO TOKENS "Our" "Price:"

; Allexperts.com found 12 pages no other engine found, in own site: Q/A style.
GET 300 http://en.allexperts.com/sitesearch.htm?Action=&terms=
good url has prior TAG div ATTR clear VALUE "res"
more url has anchortext NUMBER

; Deal with Searchforit.com now. They found 11 pages no other engine found.
; However, that's all they found; They were all referrals to other search
; portal URLs that I could have queried myself, but a few even sounded bad.
;n/g: GET 310 http://www.searchforit.com/results.html?aff_id=&cat=&searchbutton=&keywords=
;n/g: good url has anchor TAG a ATTR clear VALUE "resultlink"
;n/g: more url has anchortext NUMBER

; This Yahoo URL found 11 pages no other engine found, in various languages.
; Later I got all French results, so I'll comment it out. You may re-add it.
; GET 320 http://fr.search.yahoo.com/search?ei=ISO-8859-1&fr=cb-ovb&sa=&p=
; good url has anchor TAG a ATTR clear VALUE "yschttl"
; more url has anchortext NUMBER

; Sympatico.msn.ca found 11 pages no other engine found, good relevance.
GET 330 http://search.sympatico.msn.ca/results.aspx?q=
none good until TAG div ATTR label VALUE "results"
good url has prior TAG h3
more url has anchortext NUMBER

; Cherchons.com found 7 pages no other engine found, in several languages,
; esp. French, but none highly relevant to this Yeats-Second-Coming query.
; Later I got all French results, so I'll comment it out. You may re-add it.
; GET 340 http://www.cherchons.com/cgi/cgi.cgi?action=cherchons&start=0&cherchons=&toSearch=
; good url has prior TAG img ATTR summary VALUE "/images/flecheFavorie2.gif"
; more url has anchor TAG img ATTR summary VALUE "/images/suivant.gif"

; Ixquick.com found 7 pages no other engine found, but they were all to
; Wiki second_coming topics. Wait, do those pages have Yeats relevance?
; Yes! So Ixquick is at least somewhat of a hero, not a zero. Keep it.
GET 350 http://ixquick.com/do/metasearch.pl?cat=web&cmd=process_search&language=english&query=
good url has next TAG img ATTR summary VALUE "graphics/star.gif"
more url has anchortext NUMBER

; Search.com found 6 pages no other engine found. One was a spam-bait
; page mixing yeast with yeats; but the other 5 pages were very good.
GET 360 http://www.search.com/search?q=
good url has anchor PATH "/click"
more url has anchortext NUMBER

; Voila.fr found 4 pages no other engine found. They were relevant, but
; in French. I will comment it out. But if you speak French, restore it.
; GET 370 http://r.voila.fr/se?sev=2&ref=V_BOX_essentiel&db=web&dblg=fr&ctx=voila&lg=fr&dt=*&kw=
; good url has prior TAG div ATTR clear VALUE "lr_bloc"

; Ithaki.net found 4 pages no other engine found. However, those were
; all portal top pages, like msn.com. So, a dilution of search results.
; GET 380 http://www.ithaki.net/metasearch.cgi?where=web&Search=&query=
; good url has anchor TAG a ATTR title VALUE "_blank"
; more url has anchortext NUMBER

; Wn.com found 4 pages no other engine found. All technically relevant,
; the pages having Yeats as a race-horse name. So, welcome on board:
GET 390 http://upge.wn.com/?version=1&template=cheetah-search/index.txt&language_id=1&query=
good url has prior TAG div ATTR clear VALUE "resultsource"

; Monstercrawler.com found 4 pages no other engine found. However, they
; were all within about.com, along with some pages there both reported.
; Since this meta-engine query takes 12 seconds, and my search seemed
; to have been cut short at 9, not 99 iterations, maybe about.com would
; eventually find those pages. Therefore, I'll omit monstercrawler.com.
; GET 400 http://search.monstercrawler.com/monster/ws/redir?qcat=web&qkw=
; good url has anchor TAG a ATTR clear VALUE "resultsLink"
; more url has anchortext NUMBER

; Metacrawler.com found 2 pages no other engine found, of dubious relevance.
; That's not worth the long wait typical of a meta-engine query response.
; GET 410 http://www.metacrawler.com/info.metac/search/redir.htm?qcat=web&qkw=
; good url has anchor TAG a ATTR clear VALUE "resultsLink"
; more url has anchortext NUMBER

; Business.com found 2 pages no other engine found; However, they twisted
; the Yeats query into yeast to pass sales-spam related queries to google
; ad* URL. Avoid business.com unless you are shopping for something, etc.
;n/g: GET 420 http://www.business.com/search/rslt_default.asp?vt=all&search=&type=web&query=
;n/g: good url has next TAG span ATTR clear VALUE "url"

; Welt.de found 1 page no other engine found. It might be relevant, but
; in German. I will comment it out. But if you speak German, restore it.
; GET 430 http://www.welt.de/archiv/?se=&search.execute=true&lucyStemmed=1&lucyField=2&lucySection=21&lucySort=1&lucyMaxNumberResultsSorted=500&lucyOptimized=false&lucyExpr=
; good url has next TWO TOKENS "-" NUMBER_POINTS+","
; more url has anchortext NUMBER

; Terra.es found 1 page no other engine found. It was relevant, mostly in
; Spanish. I will comment it out. But if you speak Spanish, restore it.
; GET 440 http://buscador.terra.es/default.asp?loc=searchbox&ca=c&query=
; good url has anchor TAG a ATTR clear VALUE "fuenteUrl"
; more url has anchortext NUMBER

=============================================================
The remaining engines returned no results not returned by one of the
engines listed (or commented out) above. So offer them retests on other
queries later, but for now, comment them out:
=============================================================

;weak-- GET 701 http://search.ninemsn.com.au/results.aspx?q=
;weak-- good url has next TAG li ATTR clear VALUE "dispUrl"
;weak-- more url has anchortext NUMBER
;weak-- GET 702 http://search.latino.msn.com/results.aspx?q=
;weak-- good url has next TAG li ATTR clear VALUE "dispUrl"
;weak-- more url has anchortext NUMBER
;weak-- GET 703 http://search.msn.com/results.aspx?q=
;weak-- good url has next TAG li ATTR clear VALUE "dispUrl"
;weak-- more url has anchortext NUMBER
;weak-- GET 704 http://g.msn.com/0nwenus0/AD/16?cp=1252&submit1=&FORM=AD&q=
;weak-- good url has next TAG li ATTR clear VALUE "dispUrl"
;weak-- more url has anchortext NUMBER
;weak-- GET 705 http://xslt.alexa.com/cgi-bin/search_form?submit=&term=
;weak-- good url has anchor TAG a ATTR clear VALUE "small G"
;weak-- more url has anchortext NUMBER
;weak-- GET 706 http://www.surcha.com/search.sa?searchstr=
;weak-- good url has anchor PATH "/click.sa"
;weak-- more url has anchortext "next page"
;weak-- GET 707 http://search.netscape.com/ns/search?st=webresults&fromPage=NSCPIndex&x=9&y=11&query=
;weak-- good url has anchor TAG a ATTR clear VALUE "find"
;weak-- more url has anchortext NUMBER
;weak-- ; On probation: ilse returned one huge spam-bait page.
;weak-- GET 708 http://search.ilse.nl/web?search_for=
;weak-- good url has next TAG /h3
;weak-- more url has anchortext NUMBER
;weak-- GET 709 http://a9.com/?submit=&q=
;weak-- good url has prior TAG h2
;weak-- more url has anchortext "next"
;weak--
;weak-- I assigned a few ordinals below the original 100, so they query first.
;weak-- Usually, you ignore the query result page, and just read the hit pages.
;weak-- But in these cases, you want to read the query result pages themselves.
;weak--
;weak-- GET 710 http://www.google.com/search?hl=en&ie=ISO-8859-1&btnG=Google+Search&q=define:
;weak-- good url has prior TAG li
;weak-- keep query result page as a good web page for reading
;weak-- GET 711 http://en.wikipedia.org/wiki/Special:Search?go=&search=
;weak-- keep query result page as a good web page for reading
;weak-- ; Avoid nextag.com: manipulated Yeats into yeast to return sales spam:
;weak-- ; GET 712 http://www.nextag.com/KEYWORD/search-html?search=
;weak-- ; keep query result page as a good web page for reading
;weak-- GET 713 http://answers.yahoo.com/search/search_result?p=
;weak-- more url has anchortext NUMBER
;weak-- keep query result page as a good web page for reading
;weak-- GET 714 http://aolsearch.aol.com/aol/news?invocationType=newsChannel&category=ns_top_collection&query=
;weak-- more url has anchortext NUMBER
;weak-- keep query result page as a good web page for reading
;weak-- GET 715 http://search.monstercrawler.com/monster/ws/redir?qcat=web&qkw=
;weak-- good url has anchor TAG a ATTR clear VALUE "resultsLink"
;weak-- more url has anchortext NUMBER
;weak-- GET 716 http://extra.volkskrant.nl/zoek/index.php?sa=&as_sitesearch=het internet&q=
;weak-- bad url has anchor DOMAIN "google.com"
;weak-- good url has prior TAG span ATTR clear VALUE "rUrl"
;weak-- more url has anchortext NUMBER
;weak-- GET 717 http://newssearch.bbc.co.uk/cgi-bin/search/results.pl?scope=newsukfs&tab=news&q=
;weak-- good url has anchor TAG a ATTR clear VALUE "searchresult"
;weak-- more url has anchortext "Next"
;weak-- GET 718 http://search.aol.com/aolcom/search?invocationType=topsearchbox./aolcom/index.jsp&query=
;weak-- good url has next TAG p ATTR clear VALUE "durl find"
;weak-- more url has anchortext NUMBER
;weak-- GET 719 http://search.lycos.com/?query=
;weak-- good url has anchor TAG span ATTR clear VALUE "large"
;weak-- more url has anchortext NUMBER
;weak-- GET 720 http://www.newsday.com/search/dispatcher.front?target=article&Query=
;weak-- good url has next TAG span ATTR clear VALUE "byline"
;weak-- GET 721 http://www.welt.de/archiv/?se=&search.execute=true&lucyStemmed=1&lucyField=2&lucySection=21&lucySort=1&lucyMaxNumberResultsSorted=500&lucyOptimized=false&lucyExpr=
;weak-- good url has next TWO TOKENS "-" NUMBER_POINTS+","
;weak-- more url has anchortext NUMBER
;weak-- GET 722 http://search.netscape.com/ns/search?fromPage=nsnewssearch&query=
;weak-- good url has anchor TAG a ATTR clear VALUE "find"
;weak-- more url has anchortext NUMBER
;weak-- GET 723 http://search.internet.com/query.php?IC_QueryText=
;weak-- good url has next TWO TOKENS "-" NUMBER+"k"
;weak-- more url has anchortext NUMBER
;weak-- GET 724 http://catalogo.cerca.com/odpSearch/?cerca=&where=all&charset=utf-8&qs=
;weak-- good url has anchortext "[ anteprima ]"
;weak-- more url has anchor TAG a ATTR clear VALUE "navires_pageind"
;weak-- GET 725 http://www.infotiger.com/search?OK=&qs=
;weak-- good url has next TOKEN "size:"
;weak-- more url has anchortext NUMBER
;weak-- GET 726 http://science.slashdot.org/search.pl?query=
;weak-- good url has prior TWO TOKENS "Score:" NUMBER_POINTS
;weak-- GET 727 http://www.bigblog.com/search.cgi?terms=
;weak-- good url has next TAG blockquote
;weak-- more url has anchortext NUMBER
;weak-- GET 728 http://webreference.com/r/cs?query=
;weak-- good url has next TWO TOKENS "-" NUMBER+"k"
;weak-- more url has anchortext NUMBER
;weak-- GET 729 http://www.blogged.com/search.php?type=blogs&btnsearch=&q=
;weak-- good url has prior TAG td ATTR clear VALUE "ratingpad"
;weak-- more url has anchortext NUMBER
;weak-- GET 730 http://www.bloglines.com/search?s=&imageField=&q=
;weak-- good url has prior TAG span ATTR clear VALUE "post-date"
;weak-- more url has anchortext NUMBER
;weak-- GET 731 http://www.podanza.com/search.php?keywords=
;weak-- good url has anchortext "Podcast Details"
;weak-- more url has anchortext NUMBER
;weak-- GET 732 http://groups.yahoo.com/search?query=
;weak-- good url has next TAG div ATTR clear VALUE "group-description"
;weak-- more url has anchortext "Next >"
;weak-- GET 733 http://search.live.com/results.aspx?mkt=en-US&form=SPACEL&q=
;weak-- good url has next TAG ul ATTR clear VALUE "sb_meta"
;weak-- more url has anchortext NUMBER
;weak-- GET 734 http://www.resourceshelf.com/?s=
;weak-- good url has prior TAG h3
;weak-- GET 735 http://books.google.com/books?btnG=&q=
;weak-- good url has anchortext "About this book"
;weak-- more url has anchortext NUMBER
;weak-- GET 736 http://bureau-query.funnelback.com/search/search.cgi?collection=ansto&query=
;weak-- good url has next TOKEN NUMBER+"Kb"
;weak-- more url has anchortext NUMBER
;weak-- GET 737 http://clusty.com/search?input-form=clusty-simple&v:sources=webplus&query=
;weak-- good url has anchor TAG span ATTR clear VALUE "title"
;weak-- more url has anchortext NUMBER
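
After editing the list by hand, it can be useful to check which engines are still enabled and in what order they will be queried. The following is a minimal Python sketch for that, not part of WordsEx and not its actual parser. It assumes only what the listing above shows: a line starting with ";" is a comment, an engine starts with "GET <ordinal> <url>", the rule lines that follow begin with good, bad, more, none, or keep, and a lower ordinal goes first. The names parse_engines, Engine, and RULE_PREFIXES are made up for this illustration.

```python
import re
from typing import List, NamedTuple

# Rule keywords seen in the listing; anything else is treated as prose.
RULE_PREFIXES = ("good ", "bad ", "more ", "none ", "keep ")


class Engine(NamedTuple):
    ordinal: int
    url: str
    rules: List[str]


def parse_engines(text: str) -> List[Engine]:
    """Return the enabled (uncommented) engines, lowest ordinal first."""
    engines: List[Engine] = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and every kind of comment (";", ";n/g:", ";weak--").
        if not line or line.startswith(";"):
            continue
        match = re.match(r"GET\s+(\d+)\s+(.+)$", line)
        if match:
            engines.append(Engine(int(match.group(1)), match.group(2), []))
        elif engines and line.startswith(RULE_PREFIXES):
            # Attach the rule to the most recent GET line.
            engines[-1].rules.append(line)
    return sorted(engines, key=lambda e: e.ordinal)


# A few entries copied from the listing, deliberately out of order.
SAMPLE = """\
; Entireweb.com found 50 pages that no other engine found. Good Relevance.
GET 120 http://www.entireweb.com/query?q=
good url has anchortext "Details"
more url has anchortext NUMBER
; GET 170 http://science.slashdot.org/search.pl?query=
; good url has prior TWO TOKENS "Score:" NUMBER_POINTS
GET 100 http://www.gigablast.com/search?n=100&q=
bad url has anchor DOMAIN "dir.gigablast.com"
good url has next TAG span ATTR clear VALUE "result-link"
more url has anchor TWO TOKENS "Next" NUMBER
"""

for engine in parse_engines(SAMPLE):
    print(engine.ordinal, engine.url, len(engine.rules), "rules")
```

To run it against your own copy, read C:\WordsEx\Engines.txt into a string and pass that to parse_engines instead of SAMPLE. Prose lines such as the introduction are ignored because they match neither a GET line nor one of the rule keywords.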