Web Searching (Part Two) from the November 2000 Actrix Newsletter

by Rob Zorn

Last month, you may recall, we looked at the various ways in which different search sites work, and how they rank pages for presentation to the searcher. There were basically two types: search engines and directories. This month I want to look more closely at "true" search engines and how you can use Boolean logic and other "tricks" to save a lot of searching time.

To use search engines effectively, it is essential to apply techniques that narrow results and push the most relevant pages to the top of the results list. The main search engines that use these types of techniques are AltaVista, Northern Light, Google, Excite and Go. These search engines don't appear as "user-friendly" as search sites like the very popular Ask Jeeves, but when used properly, they can give better and much more specific results.

Read the stuff on Boolean logic slowly if you're new to searching. It isn't half as complicated as it sounds.

Identify Keywords

When conducting a search, you first need to break the topic down into key concepts. For example, to find information on what is happening with Napster's battle against the Recording Industry Association of America, the keywords might be:

          Napster      RIAA      MP3

Entering these three words on their own would probably be a big mistake. The search engine would return every page it found containing any one of those words. This is where Boolean logic comes in so handy.

Norrie says........

Always listen to the wise geek!

"Don't be foolean,
use the Boolean!"

Boolean Logic

There are two types of Boolean logic: full and implied. Full Boolean logic uses the words AND, OR and NOT. Implied Boolean logic uses the symbols + or -.

Boolean AND: In order to make sure our search engine returns only pages that contain all three of those words, we could use implied or full Boolean logic by entering either of the following into the search field:

Implied:  +Napster  +RIAA  +MP3
Full:  Napster  AND  RIAA  AND  MP3

The search engine will not return pages containing just the word Napster. It will only return pages where the words Napster, RIAA and MP3 all appear somewhere within the same web page. Thus, the "Boolean AND" narrows your search results by limiting them to pages where all the keywords appear.
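For the curious, here is a rough sketch in Python of the difference between matching any keyword and matching all of them. The three "pages" and the function names are made up for illustration, and a real search engine works from a huge index rather than a loop like this.

```python
# Hypothetical mini-collection: page name -> text on that page.
PAGES = {
    "napster-news": "Napster faces the RIAA in court over MP3 sharing",
    "mp3-players": "Reviews of portable MP3 players",
    "riaa-about": "About the RIAA and its members",
}

def words(text):
    """Lower-case a page's text and split it into a set of words."""
    return set(text.lower().split())

def match_any(keywords):
    """Pages containing at least one of the keywords."""
    wanted = {k.lower() for k in keywords}
    return sorted(name for name, text in PAGES.items() if wanted & words(text))

def match_all(keywords):
    """Pages containing every keyword (Boolean AND)."""
    wanted = {k.lower() for k in keywords}
    return sorted(name for name, text in PAGES.items() if wanted <= words(text))

print(match_any(["Napster", "RIAA", "MP3"]))  # all three pages
print(match_all(["Napster", "RIAA", "MP3"]))  # ['napster-news']
```

Matching any keyword returns every page in the collection, while AND leaves only the one page that mentions all three.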

Boolean AND NOT: AND NOT tells the search engine to retrieve web pages containing one keyword but not the other. For example, if we wanted to find information on dolphins (the aquatic mammal), but not be bombarded with web pages devoted to Miami's American football team, we could use either implied or full Boolean logic as follows:

Implied: dolphins  -Miami
Full: dolphins  AND NOT Miami

The above examples instruct the search engine to return web pages about dolphins but not web pages that are likely to be about the Miami Dolphins. Use AND NOT when you have a keyword that has multiple possible meanings (such as "dolphin" in this case). The need for AND NOT often becomes apparent after you perform an initial search. If your search results contain heaps of irrelevant results (e.g., Saturn the communications company rather than Saturn the planet), try using AND NOT to filter out the undesired websites.
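As a small sketch of what AND NOT does, the Python below keeps pages that contain one word and throws out any that also contain the other. The two sample pages and the function name and_not are invented for the example.

```python
# Hypothetical sample pages: name -> text.
PAGES = {
    "marine-life": "dolphins are aquatic mammals related to whales",
    "nfl-report": "the miami dolphins won their football game",
}

def and_not(include, exclude):
    """Pages that contain `include` but do not contain `exclude`."""
    hits = []
    for name, text in PAGES.items():
        found = set(text.lower().split())
        if include.lower() in found and exclude.lower() not in found:
            hits.append(name)
    return sorted(hits)

print(and_not("dolphins", "Miami"))  # ['marine-life']
```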

Boolean OR: Linking search terms with OR tells the search engine to retrieve web pages containing any, some or all keywords. We could use either implied or full Boolean logic as follows:

Implied: Including two or more words with nothing surrounding or separating them is equivalent to OR.
Full: Parrots OR budgerigars OR Cockatiels

When OR is used, the search engine returns pages with a single keyword, several keywords, or all keywords. Thus, OR expands your search results. Use OR when you have common synonyms for a keyword. You can surround OR statements with parentheses for best results. To narrow the results again, you can combine OR statements with AND statements.

For example, the following search statement locates information on purchasing a used car:

(car OR automobile OR vehicle)  AND  (buy OR purchase)  AND  used
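One way to picture how the parentheses work is as a list of OR groups that are joined together with AND. The Python sketch below checks a line of text against the used car statement above. The QUERY layout and the matches function are just one possible illustration, not how any particular search engine does it.

```python
# Each inner list is an OR group; the groups are joined with AND.
QUERY = [["car", "automobile", "vehicle"], ["buy", "purchase"], ["used"]]

def matches(text, groups):
    """True if the text contains at least one word from every group."""
    found = set(text.lower().split())
    return all(any(term in found for term in group) for group in groups)

print(matches("where to purchase a used automobile", QUERY))  # True
print(matches("buy a new car today", QUERY))                  # False
```

The second line of text fails because it has nothing from the last group (the word "used"), even though the first two groups are satisfied.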

Note: Use AltaVista's Simple Search for implied Boolean (+/-) searches, and use AltaVista's Advanced Search for full Boolean (AND, OR, AND NOT) searches.

Phrase Searching

Surrounding a group of words with double quotes tells the search engine to only retrieve documents in which those words appear side-by-side. Phrase searching is a powerful search technique for significantly narrowing your search results, and it should be used as often as possible.

E.g.    "John F. Kennedy"    "New Zealand"    "global warming"

You can even combine phrase searching with implied Boolean (+/-) or full Boolean (AND, OR, and AND NOT) logic.

Implied:  +"heart disease"    +cause
Full:  "heart disease"   AND  cause

The above example tells the search engine to retrieve pages where the words heart disease appear side-by-side and the word cause appears somewhere else on the page.
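A minimal Python sketch of the idea follows. It treats a phrase as a run of words that must appear next to each other and in order. The function names are hypothetical, and punctuation is ignored to keep things short.

```python
def contains_phrase(text, phrase):
    """True if the words of `phrase` appear side-by-side, in order, in `text`."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

def phrase_and_word(text, phrase, word):
    """Phrase search combined with AND on one extra keyword."""
    return contains_phrase(text, phrase) and word.lower() in text.lower().split()

print(phrase_and_word("smoking is a major cause of heart disease", "heart disease", "cause"))    # True
print(phrase_and_word("disease of the heart has more than one cause", "heart disease", "cause"))  # False
```

The second sentence contains both "heart" and "disease", but not side-by-side, so the phrase search rejects it.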

NOTE ON IMPLIED BOOLEAN LOGIC (+/-): When a phrase search is combined with additional keywords using implied Boolean logic (+/-), you must put a plus or minus sign before the phrase as well as the other keywords. Do not put a space between the plus sign and the word.

+"Blowing in the Wind"   +Dylan
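If you ever build search strings in a program, the rule above can be sketched in a few lines of Python. The helper name implied_query and its two parameters are made up for this example. It puts a plus or minus directly in front of every term, with no space, and wraps multi-word terms in double quotes.

```python
def implied_query(required=(), excluded=()):
    """Build an implied Boolean (+/-) search string."""
    def fmt(term):
        # Quote multi-word terms so they are treated as a phrase.
        return f'"{term}"' if " " in term else term

    parts = ["+" + fmt(t) for t in required] + ["-" + fmt(t) for t in excluded]
    return " ".join(parts)

print(implied_query(required=["Blowing in the Wind", "Dylan"]))
# +"Blowing in the Wind" +Dylan
```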

Plural Forms, Capital Letters and Alternate Spellings (Truncation)

Most search engines interpret lower case letters as either upper or lower case. Thus, if you want both upper and lower case occurrences returned, type your keywords in all lower case letters. However, if you want to limit your results to initial capital letters (e.g., "Ernest Rutherford") or all upper case letters, type your keywords that way.

Most search engines interpret singular keywords as singular or plural. If you want plural forms only, make your keywords plural.

A few search engines support "truncation", which allows variations in spelling or word forms. The asterisk (*) symbol tells the search engine to return any word that begins with the letters typed before the asterisk. For example, civil* returns web pages with civil, civilise, civility, and civilisation. Chem* would return web pages with chemical, chemistry, chemically, etc. You should not, however, truncate with fewer than the first four letters of any word.
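In programming terms, truncation is simply a "starts with" test. Here is a short Python sketch, with an invented word list and function name, that also enforces the four-letter rule of thumb.

```python
# Hypothetical list of words found on some web pages.
VOCAB = ["civil", "civilise", "civility", "civilisation", "city", "chemical"]

def truncation_matches(pattern, vocabulary):
    """Return the words that start with the letters before the asterisk."""
    if not pattern.endswith("*"):
        raise ValueError("pattern must end with an asterisk")
    stem = pattern[:-1].lower()
    if len(stem) < 4:
        raise ValueError("use at least four letters before the asterisk")
    return [w for w in vocabulary if w.lower().startswith(stem)]

print(truncation_matches("civil*", VOCAB))
# ['civil', 'civilise', 'civility', 'civilisation']
```

A very short stem such as ci* would also match "city" and thousands of other unrelated words, which is why the sketch refuses it.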

Title Searching

Title searching is one of the most effective techniques for narrowing results and getting the most relevant websites listed at the top of the results page. A web page is composed of a number of fields, such as title, domain, host, URL, and link. Searching effectiveness increases as you combine field searches with phrase searches and Boolean logic. For example, if you wanted to find information about George Washington and his wife Martha, you could try the following search:

Implied:  +title:"George Washington"    +President   +Martha
Full:  title:"George Washington"   AND    President   AND   Martha

The above "title search" example instructs the search engine to return web pages where the phrase George Washington appears in the title and the words President and Martha appear somewhere on the page. As with plus and minus, there is no space between the colon (:) and the keyword.
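To see why a title search narrows things so well, here is a rough Python sketch that keeps each page's title separate from the rest of its text. The two sample pages and the title_search function are assumptions made for the example, and the phrase test is a simple substring check.

```python
# Hypothetical pages, each with a title field and body text.
PAGES = [
    {"title": "George Washington biography",
     "body": "The first President married Martha Custis"},
    {"title": "US Presidents",
     "body": "George Washington was the first President and Martha his wife"},
]

def title_search(pages, title_phrase, other_words):
    """Pages with the phrase in the title and every other word anywhere on the page."""
    hits = []
    for page in pages:
        in_title = title_phrase.lower() in page["title"].lower()
        all_words = (page["title"] + " " + page["body"]).lower().split()
        if in_title and all(w.lower() in all_words for w in other_words):
            hits.append(page["title"])
    return hits

print(title_search(PAGES, "George Washington", ["President", "Martha"]))
# ['George Washington biography']
```

Both pages mention George Washington, President and Martha, but only the first has the phrase in its title, so it is the only one returned.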

These search sites support Boolean logic:

AltaVista
www.altavista.com

Northern Light
www.northernlight.com

Google
www.google.com

Excite!
www.excite.com

Go!
www.go.com

Domain Searching

In addition to the title search, other helpful field searching strategies include the domain search, the host search, the link search, and the URL search. The DOMAIN SEARCH allows you to limit results to certain domains, such as websites from the United Kingdom (.uk) or New Zealand (.nz), educational institutions (.edu), military sites (.mil) and so forth.

Implied:  +domain:uk   +title:"Queen Elizabeth"
Full:  domain:uk   AND   title:"Queen Elizabeth"

Implied:  +domain:edu   +"lung cancer"    +smok*
Full:  domain:edu   AND  "lung cancer"    AND  smok*
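The domain test itself is easy to sketch in Python using the standard urllib.parse module. The function name in_domain and the sample addresses are made up for illustration. The check looks only at the end of the host name, so "uk" matches www.royal.gov.uk but not an address that merely contains those letters elsewhere.

```python
from urllib.parse import urlparse

def in_domain(url, domain):
    """True if the URL's host name ends with the given domain."""
    host = urlparse(url).hostname or ""
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)

# Hypothetical sample addresses.
URLS = [
    "http://www.royal.gov.uk/",
    "http://www.actrix.co.nz/",
    "http://www.duke.edu/",
]

print([u for u in URLS if in_domain(u, "uk")])
# ['http://www.royal.gov.uk/']
```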


URL Searching

The URL Search limits search results to web pages where the keyword appears in the URL or website address. A URL search can narrow very broad results to web pages devoted to the keyword topic.

Implied:  +url:halloween   +stories
Full:  url:halloween   AND  stories


Link Searching

Use link searching when you want to know which websites link to a particular site of interest. For example, if you have a home page and you are wondering whether anyone has put a link to it on their website, use the link search. To find all the sites that link to the Actrix main site, we would enter the following into the search field:

link:www.actrix.co.nz
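Behind the scenes, a link search depends on the search engine having noted which addresses each page links to. The Python sketch below shows the basic idea for a single page, using the standard html.parser module. The class and function names and the snippet of HTML are invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def links_to(html, host):
    """True if the page contains a link pointing at the given host."""
    parser = LinkCollector()
    parser.feed(html)
    return any(urlparse(h).hostname == host for h in parser.hrefs)

page = '<p>My ISP is <a href="http://www.actrix.co.nz/">Actrix</a></p>'
print(links_to(page, "www.actrix.co.nz"))  # True
```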