Doing Amazon "Complex Searches" With the FREEBIE Package

This docfile assumes that you are already fully up to speed on all the material in the main package docfile Install;
if you are not, be very sure to read that docfile thoroughly before trying to make sense of this docfile.

Regular Versus "Complex" Searches:

An ordinary, or "simple", search of Amazon is the kind the Freebie package normally uses. It consists of a single phrase (a "phrase" is one or more words), which is used as the content of the "keywords" field in an Amazon "Power Search".

How Amazon finds matches to such an inquiry is entirely up to Amazon; experience will show that the book titles returned are usually on point but are sometimes rather wildly--and incomprehensibly--far off topic. Moreover, it may be that no single search phrase you can contrive will return anything like the number of titles you would ideally like to get: you may be getting too few or too many, no matter what you try.

(For your convenience, I reproduce here the portion of the main package docfile that deals with what is a good number of titles to get back.)


The search phrase you use will, of course, determine which books, and how many books, Amazon returns for you to list. Each book adds one more page to your site. It might thus seem at first that the more, the merrier, but that is not the case: there is a saturation limit.

The book titles are listed on 28 pages: 26 for the letters of the alphabet, one for non-alphabetic characters (usually numerals), and one separate page for titles beginning with the search phrase (so that, for example, a search on "Baseball" would not ridiculously overload the B page). Each such page has an "overhead" size--meaning its size with zero book listings included--of about 9600 to 9700 bytes.

At the present time, and possibly for some time to come, the search engines have size limitations in reading pages: Google, it is strongly felt, reads no more than the first 101K bytes of a page. If that is so, and I assume it is, then a titles-list page can include about 91 to 92K bytes of title listing before it reaches a length beyond which Google will discard the rest. An average single book-title listing takes about 750 bytes. Simple arithmetic tells us that any titles-list page can hold not over about 125 titles--the rest can appear, and may sell some books, but will probably not be seen by search engines as site-page links.
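The arithmetic above can be checked with a short sketch. It assumes that "101K" means 101 x 1024 bytes (the text does not say which convention is meant) and uses the overhead and per-listing figures quoted above; the function and constant names are just illustrative.

```python
# Rough check of the titles-per-page estimate, using the figures from the text.
# Assumption: "101K" is read as 101 * 1024 bytes; the text does not specify.

GOOGLE_READ_LIMIT = 101 * 1024   # bytes Google is believed to read per page
AVG_LISTING_BYTES = 750          # average size of one book-title listing


def max_titles_per_page(overhead_bytes):
    """Whole listings that fit before the assumed read limit is reached."""
    return (GOOGLE_READ_LIMIT - overhead_bytes) // AVG_LISTING_BYTES


for overhead in (9600, 9700):
    print(overhead, max_titles_per_page(overhead))
```

With those figures the result comes out at 125 and 124 listings, which agrees with the "about 125" estimate.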

With 28 pages, it might at first seem that we want to list not over 28*125 titles, or about 3,500. But we must also recall that book titles are by no means spread evenly across all 28 pages: typically, the S titles might outnumber the Q titles by as much as 30:1! In other words, we will have to "saturate" several pages--those of the commonest letters--to get much of anything on the sparsest pages.

Where exactly one stops is a matter of judgement, but my own feeling is that about 5,000 titles gives a nice distribution without wildly overloading the more popular pages. But it's all up to you. (Keep in mind that however many titles you "list", you max out with Google at 3,500 and probably, realistically, at 3,000--at least till they start reading longer pages.)
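To see the saturation effect in numbers, here is a minimal sketch. The per-page counts are made up purely for illustration (they are not measurements from any real search), and the cap of 125 is the estimate worked out earlier.

```python
# Illustration of page "saturation": listings beyond the per-page cap are
# assumed not to be read by search engines.
# The counts below are hypothetical, not data from a real Amazon search.

PAGE_CAP = 125  # estimated listings per page that Google will read


def visible_titles(titles_per_page):
    """Total listings that fall within the cap, summed over all pages."""
    return sum(min(count, PAGE_CAP) for count in titles_per_page.values())


sample = {"S": 600, "B": 400, "M": 300, "Q": 20, "X": 5}  # hypothetical
print(sum(sample.values()))    # titles listed on these five pages
print(visible_titles(sample))  # titles that fall within the cap
```

In this made-up case the popular pages are heavily overloaded while the sparse ones are nowhere near full, which is the trade-off described above.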

(But remember that the count of pages added by this package is roughly 12 times the number of titles listed:
6 Amazon divisions times 2 pages per title--1 for new at Amazon and 1 for used at ABE;
it is "roughly" because the titles totals at the 6 divisions will by no means be the same.)
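As a quick check of the "roughly 12 times" figure, here is a small sketch. The equal per-division counts are an assumption made only to keep the example simple; as noted, real totals will differ from division to division.

```python
# Pages added by the package: 2 pages per title (new at Amazon, used at ABE),
# summed over the Amazon divisions. The counts passed in are hypothetical.

PAGES_PER_TITLE = 2


def pages_added(titles_by_division):
    """Total pages added, given the number of titles listed at each division."""
    return PAGES_PER_TITLE * sum(titles_by_division)


# Six divisions, each assumed (for illustration) to list 5,000 titles.
print(pages_added([5000] * 6))
```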

A "complex" search also uses its phrase as the "keywords" field of a Power Search, but it takes advantage of so-called "Boolean" combinations of terms. A Boolean search is one in which you essentially perform multiple searches and cross-correlate their several results to get selection, or "filtering", not possible with a simple single search. The way in which the multiple searches are defined, and how their results are cross-compared, depends on a few simple rules. Those rules involve, among other things, the special words and, or, and not, which is why you are carefully cautioned not to use any of those words in a normal "simple" search phrase.


Creating a "Complex" Search:

Boolean Searching

"Boolean" searches (named after logician George Boole) let you use the words and, or, and not--called "logical operators"--plus quotation marks and parentheses to construct more-exact queries.

Note: You don't have to upper-case the Boolean operators (and, or, not) in your "complex" search phrases--in the discussions below, I do it just for emphasis.


Joint Requirements

The magic word AND signifies that any result returned must independently match both of the terms it is placed between. (To start simple, we'll consider independent terms that are each just a single word.) Suppose you wanted to get back books that deal with, say, "landscape painting"; if you use just landscape painting as a simple phrase, you may get back too many titles, or you may (indeed, will) find that many of the books returned are about "painting" and many others about "landscape", but not so many about actual "landscape painting". In that case, you could make a "complex" search phrase like this: landscape AND painting. With that, the Amazon search mechanism "knows" that to qualify as a search hit, a book has, in effect, to satisfy two independent searches: one for "landscape" and one for "painting"; only titles that would be on both of those separate lists will be returned.


Independent Requirements

The next magic word is OR; with this word, you tell Amazon that you want results from either of two (or more) effectively separate searches. Suppose you want to find titles related to any one of the fields of science "astronomy", "cosmology", or "astrophysics"; you would construct a "complex" search phrase that is just astronomy OR cosmology OR astrophysics and you would get back titles matching any of those separate searches. (In many ways, that is similar to a simple search just listing those three words; the real power of OR comes in assembling more complicated searches, as we will see in a moment.)


Omission Requirements

Now suppose your original search phrase is getting mostly what you want, but also including some stuff you'd prefer to leave out. The magic word here is NOT, which omits what it applies to. For example, you want books about "pets" but want to exclude any about "cats"; you would use as your "complex" phrase pets NOT cats.
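The three magic words described so far behave much like ordinary set operations: AND keeps what is on both lists, OR keeps what is on either list, and NOT removes one list from another. The sketch below is only a toy model of that idea; it is not Amazon's search engine, and the catalog, the keyword sets, and the search function are all hypothetical.

```python
# Toy model of Boolean searching using Python sets.
# This is NOT Amazon's search engine; the catalog and the matching rule
# (a book "matches" a word if the word is in its keyword set) are hypothetical.

CATALOG = {
    "Painting the Landscape": {"landscape", "painting"},
    "Landscape Gardening": {"landscape", "gardening"},
    "Portrait Painting": {"portrait", "painting"},
    "Caring for Pets": {"pets", "dogs"},
    "Cats as Pets": {"pets", "cats"},
}


def search(word):
    """Return the set of titles whose keywords include the word."""
    return {title for title, words in CATALOG.items() if word in words}


# landscape AND painting: titles on both lists (intersection)
print(sorted(search("landscape") & search("painting")))

# landscape OR painting: titles on either list (union)
print(sorted(search("landscape") | search("painting")))

# pets NOT cats: titles on the first list but not the second (difference)
print(sorted(search("pets") - search("cats")))
```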


Unit Phrases

Those are the "magic words", but there are a couple of other things that help you in framing "complex" queries. One (which also applies in "simple" searches) is the use of quotation marks to mark some words as an "exact phrase", meaning that they are to be treated as, in effect, a single term. For example, suppose you wanted to find books about "evolution" but not include any dealing with "intelligent design"; you would make your phrase like this: evolution NOT "intelligent design". If you didn't use the quotation marks, your search would exclude books returned by a search on "intelligent" and books returned by a search on "design"--not at all what you wanted to accomplish.
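The difference the quotation marks make can be sketched with another toy model. The book descriptions are invented, and the matching rule (a term matches if it appears as a whole word or word sequence) is an assumption for illustration only, not a statement of how Amazon actually matches.

```python
# Toy illustration of why the quotation marks matter.
# Hypothetical book descriptions; not Amazon data or Amazon's matching rules.

BOOKS = {
    "A": "evolution of species",
    "B": "evolution and intelligent design debate",
    "C": "evolution of intelligent life",
    "D": "evolution in industrial design",
}


def matches(term):
    """Books whose description contains the term as a whole word sequence."""
    needle = f" {term} "
    return {key for key, text in BOOKS.items() if needle in f" {text} "}


evolution = matches("evolution")

# evolution NOT "intelligent design": only the exact phrase is excluded
print(sorted(evolution - matches("intelligent design")))

# Without the quotes: each word excludes books on its own
print(sorted(evolution - matches("intelligent") - matches("design")))
```

In this model the quoted form drops only book B, while the unquoted form also drops C and D, which merely happen to mention one of the two words.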


Logical Requirement Grouping

The last helper is parentheses: these help when you have especially complicated queries. They tell Amazon's search to treat what is inside a pair of parentheses as an operation that is to be performed first, and its results then used as the rest of the search phrase dictates. An example will make that clearer: suppose you wanted books about paintings of dogs or paintings of cats, but not paintings of anything else. You could use this form: painting AND (dogs OR cats). That tells the Amazon search facility to first compile a list of titles that match "dogs", then a list that matches "cats", and to combine them; then make a list that matches "painting" and include everything on it that also matches the result you got before. Got it? And you can extend the complexity to your heart's desire by using "nested" parentheses (sets inside other sets).

To help clarify that last example: suppose instead you used just painting AND dogs OR cats--what would happen? Amazon would search for books matching "painting" and also matching "dogs", then search for all books about "cats", then combine the two lists, the painting/dogs list and the cats list. You would end up with many books about cats that have nothing to do with painting. See?

Or you could get the same effect as painting AND (dogs OR cats) gives by using instead (painting AND dogs) OR (painting AND cats): the search would internally list books matching both painting and dogs, then internally list books matching both painting and cats, then return to you the combination of those two lists. Generally, you should try to select between alternative castings that will work equally well on the basis of which involves less overall work by Amazon's search engine; in this example, the first form is better, because Amazon only has to do three searches ("painting", "dogs", and "cats"), whereas in the second form it has to search "painting" twice (once to match with dogs and again to match with cats). This is not a critical point--if you can only think a search out in one way that makes sense to you, just do it and don't worry about it.
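The three forms discussed above can be compared side by side in a toy model built on Python sets. The title lists are hypothetical and this is not Amazon's search engine; the ungrouped form is written with explicit parentheses to mirror the reading described above.

```python
# Toy model of grouping with parentheses, using Python sets.
# The title lists are hypothetical; this is not Amazon's search engine.

painting = {"Painting Dogs", "Painting Cats", "Painting Ships"}
dogs = {"Painting Dogs", "Dog Training"}
cats = {"Painting Cats", "Cat Care"}

grouped = painting & (dogs | cats)                # painting AND (dogs OR cats)
ungrouped = (painting & dogs) | cats              # painting AND dogs OR cats
expanded = (painting & dogs) | (painting & cats)  # the longer equivalent form

print(sorted(grouped))
print(sorted(ungrouped))
print(grouped == expanded)
```

Here the ungrouped form lets "Cat Care" slip in even though it has nothing to do with painting, while the grouped and expanded forms return the same two titles.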


Watch Your Pages!

In several places in your bookshops, the search phrase you use is explicitly displayed. For a "simple" search phrase, that is fine: "Books on Astronomy" is a fine description.

But if you use a "complex" phrase, it won't look very pretty if displayed in text or as a section title. You thus--if you do find you need to use a "complex" search phrase--need to go into a few files and change their "auto-display" to some hand-edited (by you) form that expresses in a "nice" (simple and clean) phrase what sorts of books your shops are carrying. (As always with these files, do any editing with a real text editor, not a word processor or "HTML editor".)

In particular, here is where you need to intervene:

In File:        Replace:                With:
hold.htm        ~~~ (three tildes)      your description
hold.htm        @@@                     your description
bookshop.htm    Widgets or widgets      your description

Exactly what description you use to sum up in two or three "nice" words what your complex search is finding is up to you, but do remember that it is all your visitors will see to explain what sorts of books your bookshops are carrying; don't assume that they'll automatically "know" that the books are relevant to your site's theme.

That will, in some complicated cases, take some ingenuity--but it's worth a few moments of chin in hand to get a nice result. To take a simple example to illustrate the general idea, if your actual "complex" search is set for history AND ("Washington State" OR Oregon OR Idaho), you might use "Pacific Northwest History" as the descriptive phrase your visitors will see. Anyway, think about it.

===ooOOoo===