Installation of the FREEBIE Package

(Please be sure you have already read the brief text file READ_ME.NOW before going on.)


If you are upgrading an existing installation,
be sure to first read the docfile Upgrading.


Very Exact Instructions:

I assume that you have already done the following, in accordance with the instructions on the site whence you downloaded this package:

a. That you have created, on your own ("local") computer, a new subdirectory; I will assume just for discussion that you named it /FREEBIE. (That is not, for SEO purposes, a good name, but that only matters for what you name the corresponding directory on your server, which we will deal with in a moment.)

b. That you have downloaded the ZIP file and unZIPped it (else you wouldn't be reading these instructions!).

Here's exactly how you go on.


1. Customize the customize.inc File:

Using a plain text editor (not a word processor or HTML editor), edit the file named customize.inc. The file is extensively annotated internally, with each step to take numbered; there are 11 steps--data to customize--and, though each is extensively annotated, I will reproduce the file here with yet further annotations. With any luck at all (meaning if your host-server setup does not have strange quirks), customizing this file will be very nearly all you need to do to install this package.

The file will look long, perhaps dauntingly so, when you first open it--but in fact, almost all that text is explanatory material; all you need to enter are 11 brief data (and you can usually leave several of them alone anyway).


Your Site Identity:

What appears in customize.inc is:

/****** Step 1: Who are you? *****************************************************/
//  Change the datum below to the name of your site as you want it to
//  appear on all individual-edition display pages (where it will be
//  a link to you in large print atop the page).  Try to use as short
//  a title as you can, so it doesn't get folded onto 2 lines.
//  (This datum will also appear at the top of results pages from use
//  of the free-form Amazon search facility.)

//  I advise that you use one of these two forms:

//        The Something or Other Site
//   or
//        mywonderfulsite.com

//  depending on your tastes.  Mind your capitalizations!

//  Also: if the name is quite long--say well over 50 characters--you can
//  force it onto two lines in places like larger-font headlines by putting
//  in, where you want the break, this exact character sequence:

//      <br />

//  The break (if any) will be removed when the name is shown in small fonts.

//  ==============================================================================
//  CAUTION!  If your site name contains any double-quote ( " )
//  marks, you *must* first change the double-quote marks shown
//  below to single-quote marks, as in:

//   EXAMPLE:  $myname='The "Good Times" Site';

//  If your site name contains *both* double-quote marks _and_ one
//  or more single-quote marks (or apostrophes), as in--

//       The "Boys' Night Out" Site

//  --*email* me for the correct form to use below (unless you know
//  PHP and can work it out for yourself).
//  ==============================================================================


      $myname="The SEO Tools, Toys, and Packages Site";

            

Keep firmly in mind that your visitors will see what you enter here all over the place, including in large page headers, so get it right. As it says, I recommend a name of the form I use, The SEO Tools, Toys, and Packages Site, or else, if the bare name is how you like to identify your site, the form seo-toys.com

Let me elaborate about inserting the line split: if you want your site's name, as you give it here, to be split onto two lines in places where it appears in large type, insert the line-break splitter where you want the break. (The browser would likely split the text anyway, but not necessarily where you'd like to see it split!) When the site title is presented in small type, as it is in several places, the package automatically removes the splitter, so it will always be on one line in such places. Example:

      $myname="The Podunk Hollow, Nebraska,<br />Marching and Clam Chowder Society Web Site";

The generated files are all XHTML: be sure you use a line break exactly of the form shown--<br />--or it won't get removed on small-type lines.
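
For readers who know a little PHP, here is a minimal sketch of the two points above: one way to write a name that holds both double-quote marks and an apostrophe, and how an exact <br /> splitter could be stripped for small type. The variable names $longname and $smallname, and the use of str_replace(), are assumptions made for illustration; the package's own removal code may well differ.

```php
<?php
// In a single-quoted PHP string, only the apostrophe needs a backslash;
// the double-quote marks can stand as they are.
$myname = 'The "Boys\' Night Out" Site';

// Illustrative long name containing the splitter in its exact form.
$longname = "The Podunk Hollow, Nebraska,<br />Marching and Clam Chowder Society Web Site";

// Assumed approach: replace the exact splitter with one blank space
// wherever the name is to be shown in small type.
$smallname = str_replace('<br />', ' ', $longname);

echo $myname, "\n";
echo $smallname, "\n";
```

A splitter typed in any other way (for instance without the space before the slash) would not match an exact-string replacement like this one, which is the kind of failure the warning above describes.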


Your Search Phrase:

What appears in customize.inc is:

/****** Step 2: What is your search word/phrase? *********************************/

//  Choose the keyword or keywords you will use to search Amazon.

//  YOUR CHOICE IS QUITE IMPORTANT!

//  You want results that are reasonably relevant to your site,
//  but not too narrowly focussed.

//  For example, if your site is about a small town in rural
//  Washington State, you wouldn't want the town name in your
//  keyword phrase--something like 'Pacific Northwest' would
//  work better.  This is a matter of judgement.

//  The provided "bookcount.php" tool will tell you how many
//  books a given keyword phrase will likely find at Amazon.  
//  I suggest using a phrase that gives you a likely book count 
//  of very roughly five thousand, but let common sense be your guide.

//  USE CORRECT INITIAL CAPPING: your phrase will be used for text
//  in the results pages.

//  CAUTION!  Do NOT use any double-quote ( " ) marks in your phrase!
//  You can use multiple words if you want: just separate them with
//  plain blank spaces.  Example:  $keywords="flower gardens";

//  Do NOT use the words "and", "or", or "not" in your phrase: they
//  have special significance (see the docfile "ComplexSearches.html")


      $keywords="Spokane, Washington";

            

For now, I suggest that you leave this datum alone at its default value, because it makes a good, simple "warm-up" test--that is, it will return some results, but only a few (and thus will run in seconds, not hours). We will work at finding the right phrase for you later on in the install.

Do keep in mind--later on, when you come to make your own entries--that your visitors will see this phrase just as you enter it, so use initial capitals judiciously (it's all lower-cased anyway before being sent to Amazon, so the capitalization only affects what your visitors will see).


Your Bookshop-Page File Name:

What appears in customize.inc is:

/****** Step 3: What is your front page named? ***********************************/

//  Chances are you will want to rename the "front page" of your new bookshop;
//  the sample provided is named  book-shop.php  , but you might want to
//  rename it to, for example,  widget-books.php  .  This package needs to
//  know the name you will use, so it can properly set up internal links.

//  In the datum below, replace the name shown with the name you will use:

//  The spec MUST end in  .php  so don't try to fiddle that part!


      $myshop='book-shop.php';

            

You do not have to change this entry, but if you do want to rename the file that is your bookshop's "front door"--perhaps to give it a more SEO-based name, such as history-books.php--put the wanted filename here (the package will automatically rename the file from its default name, which is what the default entry is, book-shop.php, to what you want).

It should go without emphasizing, but I'll emphasize it anyway: as the instructions say, be sure the spec ends with .php--if it doesn't, the rename will not take place, and you'll end up confused over where your shop's front page is.


Your Site Icon:

What appears in customize.inc is:

/****** Step 4: Do you have a site icon file? ************************************/

//  If you have a Windows Icon file that you use in <link rel=
//  statements, place its FULL URL here--otherwise DO NOT change this.
//  A full URL might look like:  http://www.mywonderfulsite.com/IMAGES/favicon.ico


      $myicon='';

            

Be sure to leave the datum blank if you have no "favicon" file.

(If you don't have one, you should; it's a small thing, but helps give your site a really professional look--assuming you make a nice-looking one. If you doubt your ability to create a nice one--and keeping it very simple usually works even for the art-challenged--you can find free specimens here and there around the web.)


Your Domain for Receiving Email:

What appears in customize.inc is:

/****** Step 5: What domain for email to you? ************************************/

//  The spamming-email detector will direct spammers to email you at an address
//  that shows their IP address and the date/time they took the bait; here, you
//  need to specify the domain to which such false addresses should point.

//  Sample address:  24_240_130_222__12_02_05_08_00_08@mywonderfulsite.com

//  Change the domain below to one at which you can receive emails addressed
//  as in the sample above; do NOT include any "www" or other leading part:
//  just use what lies on either side of the last period in the domain URL.


      $domain='seo-toys.com';

            

There is a module in this package--which you don't have to use if you don't want it--that will place a dummy email address in tiny, light-colored text across the very bottom of your shop front page. The address is not a fixed "dummy": it always contains the IP address of whoever, or whatever, is loading the page, plus the exact date and time it was loaded. The dummy appears with a clear warning to real human visitors that it is only a spammer-trap, and should never be clicked on.

The point of this is that when (not if) you get emails addressed to such an address, you will know that the senders obtained your email address by automated "email harvesting", which is internet abuse; and you will know exactly when they harvested it and, by IP address, who they are (an IP address is difficult or, I think, impossible to forge). You can try to hunt down who owns the offending IP address and report their abuse to their host, but that is usually much work to little or no purpose (as most hosts don't give a flying wahoo about their customers' abusive ways). But you can use the information to put Deny From IP-based blocks in place against both the harvester IP (as shown in the email "To:" area) and the IP address of the email sender. It's not a lot, but it's better than letting these swine go unhampered.
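
To make the address format concrete, here is a rough sketch of how such a trap address could be built and later decoded. The function names trap_address() and trap_ip() are hypothetical, and the date layout (month_day_year_hour_minute_second) is only inferred from the sample address shown in customize.inc; the package's actual code may differ.

```php
<?php
// Build a trap address from a visitor IP, a Unix timestamp, and your domain.
// Assumption: the date part is month_day_year_hour_minute_second, as the
// sample address in customize.inc suggests.
function trap_address(string $ip, int $timestamp, string $domain): string
{
    $ipPart   = str_replace('.', '_', $ip);
    $datePart = gmdate('m_d_y_H_i_s', $timestamp);
    return $ipPart . '__' . $datePart . '@' . $domain;
}

// Recover the harvester's IP address from a trap address found in the
// "To:" area of a spam email.
function trap_ip(string $address): string
{
    $local = strstr($address, '@', true);   // the part before the @
    if ($local === false) {
        $local = $address;                  // no @ present: use it all
    }
    $parts = explode('__', $local);         // IP part, then date part
    return str_replace('_', '.', $parts[0]);
}

echo trap_ip('24_240_130_222__12_02_05_08_00_08@mywonderfulsite.com'), "\n";
```

The recovered address is what you would put in a Deny From block.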


Your Amazon-Affiliate Code/s:

What appears in customize.inc is:

/****** Step 6: Are you an Amazon affiliate? *************************************/

//  IF you are an Amazon "Affiliate" and have an affiliate ID code,
//  enter it here--otherwise DO NOT change this datum!

//  As the instructions point out, if you ARE an associate, you and I 
//  will share sales credits on an average 50-50 basis, with particular 
//  sales being assigned at random (or as random as a php "rand()" gets).

//  Since each national division of Amazon is run separately, enter the
//  appropriate code for each national division at which you have a code.

//  BE CAREFUL!  Enter your Associate Code between the paired single-
//  quotation marks, like so:  $myUSid='owlcroft-20';

//  And, again, leave inapplicable divisions empty AS THEY ARE BELOW.


      $myUSid='';
      $myUKid='';
      $myCAid='';
      $myDEid='';
      $myFRid='';
      $myJPid='';

            

This is largely self-explanatory. This package will round up and list (in separate files, accessed separately) books from all six Amazon national divisions (but always restricted to books in English). Web sites are international: you don't know where your visitors reside. This way, they can buy books from whichever Amazon is most convenient for them. (Also, some books can only be had from particular divisions--the stock is by no means the same across the board.)

(You must enroll as an Associate separately at each Amazon national division, and each issues a unique code usable only at that division: using an Amazon US code for an Amazon UK transaction means no commission. If you cannot read German or French or Japanese, use these two facts to help make your way through the enrollment process: first, the steps are essentially identical from division to division, so you can keep two screens open, one on an English-language division, and try to keep in step; and second, the HTML code is always about the same, so you can run your cursor over links and see what they seem to lead to.)
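
The "average 50-50" sharing mentioned in the file comments can be pictured with a sketch like the one below. The function name pick_sales_id() is hypothetical, and the rule that an empty code always falls back to the other ID is an assumption; only the use of PHP's rand() comes from the file comments.

```php
<?php
// Choose which affiliate ID gets credit for one particular sale.
// Assumption: if you have entered no code for a division, the other ID
// is always used; otherwise a coin toss decides, which averages out
// to a 50-50 split over many sales.
function pick_sales_id(string $yourId, string $otherId): string
{
    if ($yourId === '') {
        return $otherId;
    }
    return rand(0, 1) === 1 ? $yourId : $otherId;
}

// Illustrative IDs only.
echo pick_sales_id('yourcode-20', 'othercode-20'), "\n";
```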


Your ABE/CJ-Affiliate Code/s:

What appears in customize.inc is:

/****** Step 7: Are you an Abebooks affiliate? ***********************************/

//  IF you are an Abebooks "Affiliate" and thus have an affiliate PID code,
//  enter it here--otherwise DO NOT change this datum.

//  If you ARE an Associate, you can get your PID Code value by going to the
//  Commission Junction site, logging in, and then getting the sample HTML
//  code for any ABE ad--any one will contain a pid=nnnnnnn statement.

//  As the instructions point out, if you ARE an associate, you and I 
//  will share sales credits on an average 50-50 basis, with particular 
//  sales being assigned at random (or as random as a php "rand()" gets).

//  (This uses only the U.S.-based ABE interface, though that in fact also
//  includes booksellers in, at the least, Canada and the U.K.)


      $myabe='';

            

That is all self-explanatory. (There is no provision for a UK-based ABE membership, inasmuch as the US one lists overseas sellers anyway.)


Appearance-Related Data:

Steps 8 through 11 have to do with the appearance of the package pages. I strongly suggest that you leave them as they are till you have everything else perfectly in order. These data are all clearly explained in the customize.inc file.


2. Customize the Auxiliary Files:

There are a few files that you need to set up to reflect your site--they are the files that control whether Google AdSense ads are to appear on your pages, and whether Digitalpoint Co-Operative Ad Network links are to appear. Both are very easy to deal with.


AdSense

There is a file named adsense1.inc; if you are going to run Google AdSense ads on your page, put your Google AdSense code in this file in place of what is there now; be sure to only modify code between the start and end lines of the AdSense code proper:

   between this:
<!-- AdSense -->
   and this:
<!-- /AdSense -->

I suggest that you use the form as given in the file, merely substituting in your own Google Publisher Code for the one shown (which is mine). At least try that out first before making more changes (the code given plugs in a horizontal 4-ad banner colored so that the border and background melt into the page).

If you do not use AdSense at all, delete everything in the file--except you might as well leave the commented Freebie header line in place, the one that looks like this:

<!-- FREEBIE:  adsense1.inc- v. 2.24  -->.

If you need more than one block of AdSense code (different ads on different kinds of pages, or more than one kind of ad on a page), copy this file to the name adsense2.inc and put the appropriate code in it; you can have up to 9 different adsense files (and you of course change the calls to match the filenames in the places they are called, which is discussed farther below, under the descriptions of the individual package files).


Co-Op Network

The file coopads.inc is, for those who are members of the Digitalpoint Co-operative Ad Network, an easy way to get your co-op ads onto your bookshop pages. The co-op PHP scripts change from time to time, and so the file coopads.inc may need editing by you from time to time to accommodate those changes (though the co-op ads seem to be quite backward-compatible). If the co-op's PHP script name does change, it will--from past experience--merely go from the form ad_network_xxx.php to ad_network_yyy.php (where xxx and yyy are three-digit numbers); all you have to do in such a case is to edit coopads.inc, by changing the co-op script-file name in the line near the top of the file that looks so:

$networkfile='ad_network_454.php';

But what it is essential that you do--if you do want to run the co-op ads--is to edit coopads.inc by simply placing a PHP "comment" marker, a double slash-- // --in front of the line near the top that looks like this--

return;

--so that it looks instead like this:

// return;

If you do not want to run any Digitalpoint network ads, just leave this file alone: it defaults to not running co-op ads.

(And remember to always edit using a simple text editor, not a word processor or HTML editor.)


3. Create the Package Home Directory:

Connect to the server that actually hosts your site on line. (I very much recommend using a straightforward ftp program, not any page-making software packages or the like.)

Pick a name for the package directory; I will use /FREEBIE in these instructions, but I recommend something much like (or identical to) the file name you selected at Datum Step #3 above--something like /history-books.

Next, pick a location for the directory; I recommend right off your site's root, but the location is actually immaterial. Using your ftp software, create the directory.

Finally, again using your ftp software, set the "permissions" of your newly made subdirectory to--depending on how that software displays them--either 777 (or it may show 0777) or rwxrwxrwx.

(I cannot tell you exactly how to do that, because each piece of ftp software is a little different. If you don't know already, poke about in your ftp software's "Files" options--it almost surely has such a setting--looking for things referring to setting or changing "permissions" or "access rights". The package will, if it finds trouble at install time that may be related to permissions, try to change the permissions by itself, but it may not be able to do that--no two servers, it seems, are ever set up in quite the same way. So try to do it manually right off the bat.)
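
For the curious, the kind of self-repair described above might look roughly like the sketch below. The function name ensure_writable() is hypothetical and this is not the package's actual code; chmod() often fails on shared hosts when the directory is owned by a different user than the one PHP runs as, which is why setting permissions by hand in your ftp software is the safer course.

```php
<?php
// Report whether a directory is writable, first trying to open up its
// permissions if it is not.
function ensure_writable(string $dir): bool
{
    if (!is_dir($dir)) {
        return false;               // nothing to fix: no such directory
    }
    if (is_writable($dir)) {
        return true;                // already fine
    }
    @chmod($dir, 0777);             // may fail silently on many servers
    clearstatcache();               // discard PHP's cached file status
    return is_writable($dir);
}

echo ensure_writable(__DIR__) ? "Directory is writable.\n"
                              : "Directory is NOT writable.\n";
```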


4. Upload the Package Files:

Switch your ftp software into that newly made directory, then upload all of the package files to your new directory. (You don't actually have to upload the doc files, including this one, those files being all the ones with the extension .html--but be wary: there are necessary package files with the extension .htm, which is why it's perhaps simplest to just upload everything.)


5. Make the pre-Install Check:

If you didn't already know this, make a definite note of it right now and forever:

You "run" a PHP script by simply loading it into your browser,
as you would do with any ordinary web-page html (or shtml) file.


In your new package directory, run the script tryme1st.php; in the best of cases, you will get a screen display something like this:


"Freebie" Pre-Install-Check Results:


Here are some key data as this package is reading them.

Please be absolutely, positively sure
that these data are correct:


  • Main URL for this site: http://seo-toys.com/

  • URL of this directory: http://seo-toys.com/seo-books/

  • Path to this directory on server: /usr/www/users/ewalker/seo-toys/seo-books/

  • Site IP Address: 216.92.242.94



Testing HTTP File-Open Capability:

Opened file OK.



Will now seek to write, then read back, a file:

      Test File does not currently exist.

      Test File now exists at size 169 bytes with timestamp February 22 2005 20:26:10.

      Here are the read-back results:

This is Line #1.

This is Line #3.

There are eight lines total.
That is actually more than is needed for this test.
But it's nice to be sure!
{end of file}

      Test File now deleted OK.



Testing completed satisfactorily.

As it says, review the data presented to be absolutely, positively 100% sure that all of the first four data are indeed correct. Your ftp software should show you the path on your host server to your new directory. If you cannot easily find out your IP Address, run the package script phpinfo.php, which will give you a long screenful of data about your site and how PHP works on it; in all that (probably near the bottom), you should find information to determine your host IP Address.



If instead of the "all OK" screen shown above, you get this--


"Freebie" Package: Preliminary Installability Check

"Freebie" Pre-Install-Check Results:


Here are some key data as this package is reading them.

Please be absolutely, positively sure
that these data are correct:


  • Main URL for this site: http://adamscountywa.com/

  • URL of this directory: http://adamscountywa.com/washington-books/

  • Path to this directory on server: /usr/www/users/ewalker/adamscountywa/washington-books/

  • Site IP Address: 209.197.117.61



Testing HTTP File-Open Capability:

      Could not open file! Possibly fopen via URL wrapper is disabled.



Testing completed UNsatisfactorily!

--the problem is very likely--almost surely--that your hosting service has disabled a certain PHP capability in the name of "security". Whether that really does anything for security is debatable, but with it disabled PHP is, for many uses (including this package), fatally crippled. Most hosts will lift the restriction, or (especially if you are cgi-wrapped and have your own php.ini file) tell you how to overcome it. What you need to ask them is:

In my PHP, is allow_url_fopen enabled? And if not, how do I go about getting it enabled?

If they tell you that it already is enabled, you have a problem beyond what a short install manual can advise you on, and you need to speak with your hosting service about it. (You can, if that gets no answers, make contact with me, as described farther on in this document, and I'll try to help you along.)
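
If you would like to check the setting yourself before (or after) asking your host, a tiny script of your own along these lines should do it. This is a minimal sketch, not part of the package; the function name url_fopen_enabled() is hypothetical, while ini_get() and filter_var() are standard PHP.

```php
<?php
// Report whether PHP may open files via URLs (the allow_url_fopen setting).
// ini_get() returns the setting as a string such as "1", "", "On" or "Off";
// filter_var() turns any of those into a plain true or false.
function url_fopen_enabled(): bool
{
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}

echo url_fopen_enabled()
    ? "allow_url_fopen is enabled.\n"
    : "allow_url_fopen is disabled; ask your host how to enable it.\n";
```

Upload it to your package directory under any name ending in .php and run it as you would any other script.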



In the worst of cases, you will instead see this:


Warning! This package will apparently not work for you!

This script has detected that PHP "Safe Mode" is On,
but you do not seem to have cgi wrapping of PHP in place
(the "interface" read was somecode, not "CGI".


If that shows up, you need to have words with your site host: "Safe Mode" ON without CGI-wrapping is impossible to work with, and thoroughly barbaric on any modern-day host server. One thing or the other needs to be changed; the likeliest is wrapping: many hosts can provide you with a cgi-wrap for PHP, and you should, in this case, first ask for that. If your host cannot supply it, ask for "Safe Mode" to be turned OFF for your site. If they can do neither, you really, really need to be with a better host (try Pair Networks--I use them, but I get nothing for referring you, so it's a 100% honest recommendation).
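
A check of this kind can be sketched as follows. The function names are hypothetical and the package's own test may be written differently; note too that "Safe Mode" was removed from PHP as of version 5.4, so on such servers the setting simply does not exist and this sketch reports it as off.

```php
<?php
// True if the old "Safe Mode" setting is on. Where the setting no longer
// exists, ini_get() returns false and this function returns false.
function safe_mode_on(): bool
{
    return filter_var(ini_get('safe_mode'), FILTER_VALIDATE_BOOLEAN);
}

// True if the server interface name indicates CGI (for example "cgi" or
// "cgi-fcgi") rather than a web-server module such as "apache2handler".
function is_cgi_interface(string $sapi): bool
{
    return stripos($sapi, 'cgi') !== false;
}

$sapi = php_sapi_name();
if (safe_mode_on() && !is_cgi_interface($sapi)) {
    echo "Safe Mode is On without CGI wrapping (interface: $sapi).\n";
} else {
    echo "No Safe Mode problem detected (interface: $sapi).\n";
}
```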



If you get some "in-between" case--no fatal warning, but either some of the data are incorrect or, worse, the script apparently could not correctly write, read back, then erase a test file--things can get complicated. If you know a little about servers and PHP, you can try to see what might be the matter. Use the package script phpinfo.php to get system and PHP data. If you simply cannot puzzle out what's wrong, email me with what the screen showed and the correct URL to the package directory, and I'll see what I can find out for you.

This package has been greatly simplified "under the hood" since the 1.x versions, but web hosts remain a very various bunch with respect to both server conditions and their ability to explain and alter those conditions. On most good servers, this package ought to work "out of the box"; but if it doesn't, I'm willing to work with you to make it work.

We now proceed on the assumption that either initially or after some efforts on your (and possibly my) part, we are at the stage where you get the "ok" screen, the one shown first here, the one that says Testing completed satisfactorily at the bottom (and you have verified that all the data shown are correct).


6. Do the Actual Install:

Nothing to it: just run the package script finstall.php. You should get a screen that is a long laundry list of tasks accomplished, followed by a note that if there were no overt error warnings, you're installed and ready to run.

The install will have created several subdirectories of the main directory that you created. It will also create one new file: robots.new. That file is intended (after review of it by you!) to replace your existing robots.txt file, which is in your site root directory. If you already had--as you should have--a robots.txt file, the installer will have read it and simply appended a little new material to its contents; if there was no such file, this new one will be it. (In other words, this file should, with renaming, wholly replace your existing robots.txt file.) Note that using this new file is not mandatory--it just keeps searchbots out of the new subdirectories, in which there is nothing for them, but nothing that they shouldn't see, either. It's just a bot time-saver.

(Note that the installer script does not try to determine if your existing robots.txt file already has the needed "disallow" lines in it, so the tentative robots.new file might partly or wholly duplicate what you already have in place; just eyeball it and use your common sense.)
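
If you would rather not eyeball it, the comparison can be sketched in a few lines of PHP. The function name missing_lines() and the directory names in the usage line are hypothetical; substitute the actual lines you find in robots.new.

```php
<?php
// Return those wanted lines that do not already appear in the text of an
// existing robots.txt file. Lines are compared exactly, apart from
// leading and trailing blank space.
function missing_lines(string $existing, array $wanted): array
{
    $have = array_map('trim', preg_split('/\R/', $existing));
    $missing = [];
    foreach ($wanted as $line) {
        if (!in_array(trim($line), $have, true)) {
            $missing[] = $line;
        }
    }
    return $missing;
}

// Hypothetical directory names, for illustration only.
$current = "User-agent: *\nDisallow: /FREEBIE/ONE/\n";
print_r(missing_lines($current, ['Disallow: /FREEBIE/ONE/', 'Disallow: /FREEBIE/TWO/']));
```

Only the lines this reports as missing need to be added to your existing file.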


7. Determine Your Final Search Phrase:

The search phrase you use will, of course, determine which books, and how many books, Amazon returns for you to list. Each book adds one more page to your site. It might thus seem at first that the more, the merrier, but that is not the case: there is a saturation limit.

The book titles are listed on 28 pages: 26 for the letters of the alphabet, one for non-alphabetic characters (usually numerals), and one separate page for titles beginning with the search phrase (so that, for example, a search on "Baseball" would not ridiculously overload the B page). Each such page has an "overhead" size--meaning its size with zero book listings included--of about 9600 to 9700 bytes.
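
The sorting of titles onto those 28 pages can be pictured with the sketch below. It is only an illustration of the scheme just described: the function name list_page() and the page labels 'phrase' and '#' are hypothetical, and the package may treat details (such as leading articles like "The") differently.

```php
<?php
// Decide which of the 28 list pages a title belongs on:
// 'phrase' for titles beginning with the search phrase, 'A' to 'Z' by
// first letter, or '#' for anything non-alphabetic.
function list_page(string $title, string $phrase): string
{
    $title = ltrim($title);
    if ($phrase !== '' && stripos($title, $phrase) === 0) {
        return 'phrase';            // title starts with the search phrase
    }
    $first = ($title === '') ? '' : $title[0];
    if ($first !== '' && ctype_alpha($first)) {
        return strtoupper($first);  // one of the 26 letter pages
    }
    return '#';                     // numerals and other characters
}

echo list_page('Baseball for Beginners', 'Baseball'), "\n";
echo list_page('Big Book of Bats', 'Baseball'), "\n";
echo list_page('101 Pitching Tips', 'Baseball'), "\n";
```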

At the present time, and possibly for some time to come, the search engines have size limitations in reading pages: Google, it is strongly felt, reads no more than the first 101K bytes of a page. If that is so, and I assume it is, then a titles-list page can include about 91 to 92K bytes of title listing before it reaches a length beyond which Google will discard the rest. An average single book-title listing takes about 750 bytes. Simple arithmetic thus tells us that any titles-list page can hold not over about 125 titles--the rest can appear, and may sell some books, but will probably not be seen by search engines as site-page links.

With 28 pages, it might at first seem that we want to list not over 28*125 titles, or about 3,500. But we must also recall that book titles are by no means spread evenly across all 28 pages: typically, the S titles might outnumber the Q titles by as much as 30:1! In other words, we will have to "saturate" several pages--those of the commonest letters--to get much of anything on the sparsest pages.

Where exactly one stops is a matter of judgement, but my own feeling is that about 5,000 titles gives a nice distribution without wildly overloading the more popular pages. But it's all up to you. (Keeping in mind that however many titles you "list", you max out with Google at 3,500 and probably, realistically, at 3,000--at least till they start reading longer pages.)

(But remember that the count of pages added by this package is roughly 12 times the number of titles listed:
6 Amazon divisions times 2 pages per title--1 for new at Amazon and 1 for used at ABE;
it is "roughly" because the titles totals at the 6 divisions will by no means be the same.)
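
The arithmetic of the last few paragraphs can be laid out as a short worked example. The figures are the ones given above; reading "101K" as 101 x 1024 bytes and taking the page overhead as 9,700 bytes are assumptions, and the variable and function names are for illustration only.

```php
<?php
$readLimit = 101 * 1024;   // assumed meaning of "101K": 103,424 bytes
$overhead  = 9700;         // size of a list page with no titles on it
$perTitle  = 750;          // average bytes per book-title listing
$listPages = 28;           // 26 letters + non-alphabetic + search phrase

// Titles that fit on one page before the assumed read limit is reached.
$titlesPerPage = (int) round(($readLimit - $overhead) / $perTitle);

// Ceiling on titles visible to the search engine if every page were full.
$indexedCeiling = $listPages * $titlesPerPage;

// Rough count of site pages added: 6 divisions times 2 pages per title.
function pages_added(int $titles): int
{
    return 6 * 2 * $titles;
}

echo "Titles per page: $titlesPerPage\n";        // prints 125
echo "Ceiling on titles: $indexedCeiling\n";     // prints 3500
echo "Pages added for 5,000 titles: ", pages_added(5000), "\n"; // prints 60000
```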

The way you "tune" your search phrase to draw the wanted number of titles is by use of the package script bookcount.php. That script lets you see at once the count of titles that any search phrase you want to try will give you. When you run the script, you will see a screen looking like this (Note! this sample is not itself functional):


Search-Phrase Results:

For the search keyword phrase The Web, Amazon U.S.A. would return exactly 14,812 "raw" titles--but, since most would be unavailable, you would, by rough rule of thumb, probably actually end up listing somewhere from 3,703 to 5,184 books in all, with around 4,444 being the most likely figure.

Word/Phrase to try next: (entry field not reproduced here)

Amazon division to search in: (selector not reproduced here)


The presentation above is just a sample. When you first call the script, it will be showing results for the default phrase in customize.inc and for the Amazon U.S.A. division. You can try phrase after phrase, in any division, till you find one that gets the results you want.
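
The "rough rule of thumb" in the sample is consistent with listing about 25% to 35% of the raw count, with 30% the most likely figure. Those percentages are inferred from the sample numbers, not taken from the script, and the function name estimate_listed() is hypothetical; treat this as a sketch of the idea only.

```php
<?php
// Estimate how many books would actually be listed for a raw title count.
// Assumption: 25% (low), 30% (likely) and 35% (high), as the sample
// screen's figures suggest.
function estimate_listed(int $rawTitles): array
{
    return [
        'low'    => (int) round($rawTitles * 0.25),
        'likely' => (int) round($rawTitles * 0.30),
        'high'   => (int) round($rawTitles * 0.35),
    ];
}

print_r(estimate_listed(14812));   // low 3703, likely 4444, high 5184
```

By the same inferred ratios, a target of about 5,000 listed titles would correspond to a raw count somewhere around 16,000 to 17,000.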

The division you select to test phrases on should depend on where your site is mainly likely to be viewed. Right now, the package uses the same phrase for searching all six divisions; but you should do your searching in the division that will be the one that you will set as the default in your bookshop's "front page" (as explained farther below).

When you have determined a phrase to your liking--and never forget that the phrase itself will be seen by your visitors--change Datum #2 in your customize.inc file and upload the changed file to your server.

(I strongly recommend that at first you just leave the phrase in customize.inc at what it has when you get the package, so as to simplify your initial testing by having a very short run; only when you feel confident that everything seems to be running ok should you put your "real" phrase into customize.inc.)

If you are having trouble finding a search phrase that both matches tolerably well the focus
of your site and yields a reasonable number of titles--not too many, not too few--you might
need to use what Amazon calls a "complex" search phrase.
For more on those, consult the package docfile Complex Searches.


8. Do the Initial Amazon Search:

Load the script doall.php; if you are using the default search phrase provided (Spokane, Washington), the run time should be fairly short; for "real" phrases--ones that produce thousands of titles--the script will run for quite a long time. In all cases, it will be searching out results from all six Amazon divisions at a pace of 10 "raw" titles a second (it cannot go faster according to Amazon's Terms of Service). One division's run could easily take a half hour (usually, but by no means always, the non-U.S. divisions have smaller stocks for a given title search, so usually the Amazon.com U.S. run will take the longest, and it is the first of the six to run).
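
To get a feel for run times, the pace of 10 raw titles a second can be turned into a quick estimate. This is a minimal sketch; the function name run_seconds() is hypothetical, and it ignores network delays and any other overhead, so real runs will take somewhat longer.

```php
<?php
// Minimum seconds needed to fetch a given number of raw titles at a
// fixed pace (10 per second by default).
function run_seconds(int $rawTitles, int $perSecond = 10): int
{
    return (int) ceil($rawTitles / $perSecond);
}

$seconds = run_seconds(14812);     // the raw count from the sample screen
printf("About %d seconds, or %.1f minutes, for one division.\n",
       $seconds, $seconds / 60);   // 1482 seconds, 24.7 minutes
```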

When the searches are all done (the screen will tell you what's happening), look at the results and see if they look proper to you. Start by loading the page (in your Freebie subdirectory) with the name you provided at Step 3 in customize.inc, and then just go on from there as if you were a visitor to the site. Check several letter pages to be sure you're seeing results (title listings) and that they look correct. Click on a few to be sure you get the corresponding page for each listed title.

If you feel that everything has worked properly, just plug your real search phrase into customize.inc, run the searches again (via the script doall.php), and you're ready to open your shop doors.


9. Setting Up the Links to Your Bookshops:

Notice that that's bookshops--plural. You have six distinct bookshops--one for each of the six Amazon national divisions--each quite different from the others (in content--superficially, they look a lot alike).

The first thing to do is set up links from elsewhere on your site--at the very least, from some spiderable page at or near the top of your page hierarchy (ideally your main index page)--to the "front door" of each of your new bookshops. The name of your "front door" page will be whatever is set at Datum #3 of customize.inc, regardless of division (the division identity is passed to it as a parameter when it's called). And the following point cannot be overemphasized:

Provide a separate link for each of the six national Amazon divisions!

By that, I mean that you should link to your bookshop "front page" in a manner something more or less like this:

<p><u>Look for widgets books in the Amazon national division of your choice:</u></p>
<ul>
<li><a href="http://seo-toys.com/seo-books/web-books.php?in=us">U.S.</a></li>
<li><a href="http://seo-toys.com/seo-books/web-books.php?in=uk">U.K.</a></li>
<li><a href="http://seo-toys.com/seo-books/web-books.php?in=ca">Canada</a></li>
<li><a href="http://seo-toys.com/seo-books/web-books.php?in=de">Germany</a> (English-language titles only)</li>
<li><a href="http://seo-toys.com/seo-books/web-books.php?in=fr">France</a> (English-language titles only)</li>
<li><a href="http://seo-toys.com/seo-books/web-books.php?in=jp">Japan</a> (English-language titles only)</li>
</ul>

--where, instead of web-books.php you would have the name you gave your front-door file in your customize.inc file (and, of course, you would use the actual domain name and path to the package).

For that HTML, your visitors would see this:

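Look for widgets books in the Amazon national division of your choice:

  • U.S.
  • U.K.
  • Canada
  • Germany (English-language titles only)
  • France (English-language titles only)
  • Japan (English-language titles only)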

Naturally, you can and should lay that out as best fits your personal style and tastes and your site's "look and feel", but: be sure all six links exist somewhere where searchbots can find them.

If you do not do it that way, and just link to the bookshop page, with or without some one national-ID parameter passed, the search-engine robots will not know of, and thus not index or count, the pages in your five other national bookshops.

(Even though this package makes Google sitemaps for you, there are more search engines in the world than just The Big G, so get those links in place.)

The codes for the six divisions are simple:

  • us = U.S.A.
  • uk = U.K.
  • ca = Canada
  • de = Germany
  • fr = France
  • jp = Japan

(Providing no parameter defaults to Amazon U.S.A.) Remember also that your customers, no matter which division they selected, can--right from the bookshop page itself--change to any other division as and when they please.


10. Setting Up Regular Updating:

It is important for many reasons that you keep your listings pretty well up to date (for one thing, search engines like often-changing content). So that you need not devote hours a day to watching screen lines roll up, the script cronf.php can be used to start doall.php as a "daemon" process, meaning it can and does run "unattended" and invisible (cronf.php itself takes about 6 seconds to run).

You can manually start cronf.php once every day, but life is simpler if you just schedule it as a cron job on your server. (If you don't know what cron is, or--correspondingly--how to schedule unattended script execution on your server, you can do a Google and ask your host, respectively.) It is not utterly essential that the updates be run every single day, but they should be run at least every couple of days--and if you're going to do that, why not stay fully up to date?

Access to and use of a cron scheduler varies from host to host. Here is some general guidance, but your host is the final source of valid information on using cron on your server.

Your first task will be to find where the Cron interface is. Go to whatever sort of "control center" your host provides for your account and poke about--you're looking for references to either "cron" or a "scheduler" (some hosts hide this under something called "Advanced Options" or the like).

Regardless of the exact nature of the interface, you will have to enter both schedule data (how often and when to run) and command data (what to run). Most ISPs' setups make entering the scheduling information fairly simple; you want the script to run daily, either at some time you set or--as is optional on many hosts--at a time of their daily determination, when their load is slack (a good idea).

I have discovered that the easiest way to run cronf.php is to not try to call it as a server file--which can get needlessly complicated on even the best hosts--but rather to simply call it as a web file. The easiest way, in turn, to do that is to use one or the other of two widely available utilities, wget or curl, as the command you call from cron. Each of those two is a command-line utility used to fetch a file over the internet; we aren't really interested in fetching cronf.php, just in seeing that it is loaded as a web file by something--the "something" here being one of those two utilities.

Your host should have one or the other--or, if your host is any good at all, both--of those utilities in place. All you have to find out--which you do by explicitly asking your host--is how to invoke them from cron. To give you an idea, here is what one of my cron commands looks like:

/usr/local/bin/wget -q -O - http://seo-toys.com/seo-books/cronf.php > /dev/null

There, /usr/local/bin/ is the path, on my host's server, to the directory in which the wget utility is; -q (quiet) and -O - (send the fetched page to standard output rather than saving it as a file) are flags being passed to wget; and > /dev/null tells the shell to dump that output into "the bit bucket" (oblivion, the null device). The exact form you will need you must get from your host, but it should look something like that. (curl works in much the same way.)
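
For those whose host offers a raw crontab rather than a fill-in form, here is a minimal sketch of what a complete daily entry using curl might look like. The run time (03:15), the domain, and the directory name your-books are all assumptions for illustration only, and your host may require a full path to curl, just as mine does for wget.

```shell
# Print a crontab line that fetches cronf.php once a day.
# Arguments: minute, hour, and the full URL of cronf.php on your site.
# The five schedule fields are: minute, hour, day-of-month, month, day-of-week.
# curl -s suppresses progress output; -o /dev/null discards the fetched page.
cron_line() {
  printf '%s %s * * * curl -s -o /dev/null %s\n' "$1" "$2" "$3"
}

# Hypothetical values: run at 03:15 every day against an example site.
cron_line 15 3 http://www.mywonderfulsite.com/your-books/cronf.php
```

The printed line is what you would paste into the crontab (or into the "command" and "schedule" boxes of your host's scheduler, split accordingly).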


11. Sitemaps:

Google now provides a way for webmasters to tell it exactly what pages they have on their sites, which is very useful because searchbots are not always so good at finding pages with php-type parameters (the data that come after a ? in a php-script's URL). This package will now automatically generate full sitemap files for all of its pages on every run.

(It will also auto-notify Google whenever those files change, which will probably be at just about every update--daily, if you're following my recommendations.)


Making the Initial Files

There are some things you need to do so that this feature will be of value to you. If you are not already set up with Google for sitemaps, I will tell you how to get set up in a moment. Meanwhile . . . .

The install script will create in your Freebie directory a new file named Sitemap_index.xml; move (or copy) that file to your site's root directory (the installer does not automatically put it there lest it overwrite any prior file of yours coincidentally happening to have that name). That file is the one you will submit to Google as your sitemap for your entire site. It is what is called a sitemap "index" file, meaning it does not list site pages directly, but rather lists other sitemap files that do list actual pages.

This new index file has a single listing for a sitemap of all your site's pages other than those originating in your bookshops. I also provide, as I will explain in a moment, a script that will make such a map for you; but, if you already have a Google sitemap, or a sitemap index file, just plug its name and other data into the Sitemap_index.xml in place of the file there named staticmap.xml. If that is the case, you have now done all that you need to do (since you are already registered with Google, and all that) except to add this new sitemap index file to your existing Google list (and, if you have plugged your existing file's name into this one, you can drop the old Google listing).

If you do not at present have any sitemap, you can use the script makestatic.php to generate one for you. Do not use the makestatic.php script until you have the Sitemap_index.xml file in place in your root directory--because it updates that master index file.

The makestatic.php script is as easy as pie to use: just run it. But, if your site has many tens of thousands or more of pages (other than those the bookshops will be providing), you probably want some other way to make an index file, because this script does not compress the resulting index file, nor take account of Google's per-mapfile limits (no more than 50,000 URLs, no larger than 10MB when uncompressed).

After you run the script, you will find in your site's root directory a file named staticmap.xml. It is important that you manually review and modify this file. Having no way to know better, it assigns every page a "weight" of 0.5 (Google's default) and an average-update frequency of "weekly". (You should consult Google's explanations of the XML data fields to be sure you have straight in your mind what the true significances of such data are and are not.) Go through it and put correct values in for the changefreq and priority parameters for each file (that is tedious, but is a one-time task that you need to do, no matter what, if you want to have Google sitemaps at all).
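
To give an idea of what you will be editing, here is a small sketch that prints one page entry in the standard sitemap layout with the two fields in question. The exact set of fields that makestatic.php writes may differ (that is an assumption here), and the page URL and values shown are only examples; the sketch also does no escaping of special characters such as & in URLs.

```shell
# Print one <url> entry in sitemap format.
# Arguments: page URL, changefreq value, priority value.
sitemap_entry() {
  printf '  <url>\n'
  printf '    <loc>%s</loc>\n' "$1"
  printf '    <changefreq>%s</changefreq>\n' "$2"
  printf '    <priority>%s</priority>\n' "$3"
  printf '  </url>\n'
}

# Hypothetical example: a home page that changes daily and matters most.
sitemap_entry http://www.mywonderfulsite.com/index.html daily 1.0
```

In the generated file, every entry starts out as "weekly" and 0.5; the manual review consists of changing those two values page by page.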


Verifying Your Sitemaps

It is a wise practice, before submitting sitemap XML pages to Google, to use a third-party verification service to assure that they contain no errors of form. There are several free online verification services. One handy one is the Smart-IT Verifier, at the page linked here.

Be sure to verify each of your sitemaps: the main map-index file, the staticmap map file, and each of the six bookshop map files. (If you find anything wrong in any mapfile generated by this package, please let me know asap--but there should be no problems.)


Getting Set Up With Google

If you are not already registered with Google's sitemap-submission program, you obviously need to get registered. You can find the full details on Google's site (as linked here), but the essentials are these:

  1. If you do not already have a general Google "account" (a login and password), you apply for and get one.
  2. You log in to Google's Sitemaps page.
  3. You click on the Add button on the page that logging in takes you to.
  4. You select the General Web Sitemap radio button, then click on the Next button.
  5. On the page that takes you to, you enter your site's index URL in this exact form (but, of course, using your site's domain--with or without the leading www. as is your normal practice):

    http://www.mywonderfulsite.com/Sitemap_index.xml

  6. Click on the Add Web Sitemap button.
  7. Google will then show you a complicated file name, something like GOOGLEf73b2ca951e26dd7.html; make a file with that exact name, then upload it to your site's root directory (the file's contents are immaterial--it can be just blank).
  8. Finally, at your main Google Sitemaps page, where your site will now be listed, click on Verify, listed by the site's name, then click again when asked to "verify" your site (Google looks for that uniquely named file to verify that you are indeed the owner of the site, as you prove by uploading that file to it).

That's it. You can, and should, check in periodically at your Google Sitemaps page to see your stats and generally keep an eye on things.


12. Further Customizing:

Appearance

If you look at the last three Data (#9 - #11) of the customize.inc file, you will see that you can, to some extent, further customize the appearance of the title-list pages and the individual-title sell pages; the descriptions in the customize.inc file are sufficiently self-explanatory.

Note that when you make a change in customize.inc, it does not take effect till the title-list dropin pages are re-made. If you don't want to wait for, or do, another search, you can see the results immediately by running the package script pull.php, which just remakes the pages from the last search's raw data. Recall that you need to do that six times--once for each national division you want to update. The exact form of the call is--

pull.php?in=us

--where you substitute, in turn, the six national-division codes where "us" is shown above. (That script runs in a few seconds at most.)
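
If you would rather not type six calls by hand, the list of calls can be generated. This is a minimal sketch; the package-directory URL passed in is an assumption, so substitute your own.

```shell
# Print the six pull.php URLs, one per Amazon national-division code.
# $1 is the URL of your package directory (no trailing slash).
pull_urls() {
  for code in us uk ca de fr jp; do
    printf '%s/pull.php?in=%s\n' "$1" "$code"
  done
}

# Hypothetical example site and directory.
pull_urls http://www.mywonderfulsite.com/your-books
```

You can paste each printed URL into your browser in turn, or, on a machine that has wget, feed the list to it with something like: pull_urls http://www.mywonderfulsite.com/your-books | while read -r url; do wget -q -O - "$url" > /dev/null; done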

You can go quite a ways further yet in customizing. There are five datafiles with the extension .htm, and each is the essential HTML for one or more of your bookshop's PHP-generated pages. The files, and what they generate, are:

  • amazonmessage.htm: the "Amazon Message" page (amazon-message.php)
  • booksearch.htm: the new-books search-set page (book-search.php)
  • bookshop.htm: the bookshop "front page" (whatever you named it at Datum #3)
  • hold.htm: the 28 title-display pages (holder.php)
  • usedbooks.htm: the used-books search-set page (used-books.php)

The critical key to successfully customizing those pages (besides the usual routine of backup--check--keep multiple backups) is simple:

Do not change or remove any HTML Comment lines!

<!--  That's Lines THATLOOK like this  -->

Some such lines really are just comment lines, put in to help mark out the sections of the pages. But many are "placeholders" that the governing PHP script uses to mark where it should put what calculated values--so those cannot be messed with.

If you have a modicum of HTML knowledge and common sense, you should easily enough be able to figure out what's which, and deal with them accordingly. You can move such lines about in the page, but always keep in mind their environment (don't take a line of table rows and move it outside the table!).

One thing I recommend to give your bookshop the look of a real custom effort is putting in the front-page .htm file a section listing a few books on your topic that you personally and especially recommend. You can see an example of what I mean here.


Modules

This version of Freebie is highly modular, and you can add (or subtract) various modules, including HTML of your own devising. You could, for sheer HTML, include it direct within the .htm pages, but that makes each such page an ever-changing accretion of changes and differences; it is lots simpler to throw in separate modules with a one-line HTML comment.

If you look at the bottom of the package file hold.htm, you will see this:

<!-- KEYSITELINKS : puts in links to other package pages -->

<!-- ROLLYOUROWN : this will insert a module of your design -->

<!-- COOPADS : this will insert a 5-ad mini-table of co-op network ads, with a disclaimer -->

</body>
</html>

Those three HTML comment lines each call in a particular module. If you look at the list of files in this package, you will find, among the others, these:

  • rollyourown.inc
  • coopads.php

(There is nothing corresponding to KEYSITELINKS because that line calls things hard-coded into holder.php, namely links to your new-book and used-book searches and the bookshop front page.)


Digitalpoint Co-Op Ads

The module coopads.php included in this package is for those who belong to the Digitalpoint Co-operative Ad Network and want an easy way to insert the required co-op ads onto the pages of this package. It is fully explained above, at Step 2 ("Customize the Auxiliary Files"), and below at coopads.php under "The Component Files".


"Roll Your Own"

The module rollyourown.inc is included largely as a demonstration, so that you may include some things of your own devising and preference; all it is, is simple, straight HTML. If you don't want what's in it now, add, subtract, or revise to your heart's content.

If you know, or at least are not afraid of, PHP, you can also easily modify the various actual PHP scripts (as listed in the table a little ways up) that handle these modules so that they can include yet more custom modules, of whatever names you like. You take a bit of existing PHP code, such as--

// (user-made)
if (strpos($line,'ROLLYOUROWN')!==FALSE)
{
  $yourown=file('rollyourown.inc');
  foreach ($yourown as $your)
  { emit($your.$crlf); }
  $line=NULL;
}

--copy it in place, and just change ROLLYOUROWN and rollyourown.inc to other names, for whatever module you want to call in; then you insert a line in the HTML of the .htm file that looks like--

<!-- NEWMODULECALL : this will insert a module of your design -->

--where NEWMODULECALL is whatever you use to replace ROLLYOUROWN. All in all, it's easier to do than to explain.


Be sure to read the "Notes" info block below carefully.
It contains information that can enable you to
yet better customize, use, and understand this package.
Think of it as a "Freebie Tips 'N' Tricks" section.





Notes About the FREEBIE Package

The Component Files:

It always helps in using a package to know what each file is and does. And a by-file listing is a handy way to throw in more notes on package usage. Here is a brief description of each file--think of this section as a "Freebie Tips 'N' Tricks" doc.

Since any "logical grouping" would be subjective, I present them alphabetically (not subdivided by extension).


abecall.php

This is a "relay" script: it is called with a title and author as parameters, from which it constructs the proper call URL to the ABE Search (through Commission Junction); it then simply does a meta refresh to that URL. The point of this roundabout way of calling is that in the book-title listings, the link to this script is now effectively to another page in your site (for each of the national divisions), just as the Amazon new-book page is, so your added page count is now effectively doubled. (This script does not want or need to be "hidden" in robots.txt, else it would not be counted as a site page.) Though it gives the same results no matter which Amazon national division's listings it is called from, it takes the division code as a parameter, and thus looks like as many distinct pages as times a given title shows up across divisions.


adsense1.inc

This is where you put your Google AdSense HTML code. If you are not an AdSense member, just delete all the code in it.

If you need more than one form of AdSense code--for example, for different color combinations, or different display types--just make more modules, up to 10 total, each named adsenseX.inc, where X is a numeral from 0 to 9; then, in the .htm file or files where you want these other codes to go, place reference lines just like the original, but with the different numbers. An example of an existing call line from an .htm file is:

<!-- ADSENSEDROPIN #1 : puts a particular Google AdSense insert here -->

Just change the 1 to whatever number AdSense module you want called there instead. And you can put any number of such module calls in any of the .htm files, which template all the pages a visitor will see except for the numerous individual-title pages. On the individual-title pages--it seemed to me--putting AdSense ads would be inappropriate, as the visitor is there only to see about that one book, and one muddles the "buy this!" message by using other ads; but, if you want to add some there, too, you can add this little block of code to the PHP script free.php at the point where you want the AdSense ads:

$adsense=file('adsense1.inc');
spill($adsense);

(Using, of course, whichever adsenseX.inc you want to appear there.)


amazon-logo.gif

This is the small logo that is used where the mandatory Amazon message is called from your bookshop front page. Amazon has others available to Associates, and you can feel free to substitute in whatever other one you may want--just being sure to rename it to the set name of amazon-logo.gif.


amazon-message.php

This is the PHP script that handles the HTML in the associated file amazonmessage.htm. It detects various special-form "comment" lines in the .htm and operates on them to drop in HTML code created on the fly by the PHP script in response to those comment lines.

This general principle of separating a PHP script from an associated .htm file allows those not comfortable with PHP to nonetheless customize their pages freely, since the .htm files are all just plain HTML, albeit with "comment" lines that have special meaning to the associated managing PHP script. You can edit the .htm files to your heart's content; that includes moving the various special-comment lines about within such a file, or even omitting them or adding extra copies. Each special-comment line has the general form:

<!-- SPECIALKEYWORD : explanation of what the keyword calls up -->

The example a few lines up, concerning AdSense Code, demonstrates this.

In general, unless you are quite comfortable editing other people's PHP code, you leave the PHP handler scripts alone and play with the .htm files.


amazonmessage.htm

This is the actual HTML page code that goes with the PHP script just described. As with all the .htm files, you can freely modify it so long as you keep the special-comment lines unchanged (though you can move most about as you like). Do recall, though, that--as it says in this file--Amazon's Terms of Service require that you present the bare message herein. I'd just leave this alone: it's not worth any effort playing with unless you are anal-compulsive about having all your site pages exactly match some format.


attack.php

This is an ancillary tool not actually needed for anything, but included as a little "extra". It is intended for verifying that the spammer-throttle (described later) is working as you expect: it "attacks" the Amazon-message page (arbitrarily chosen by me for this test) by requesting it every N seconds, where you can set N to be anything from 1 to 10. You set that parameter by how you call the script; a call of--

attack.php?every=6

--will generate a request every 6 seconds. When you are happy with the settings in the timer.inc script, use this script to verify that request rates slower than your cutoff can go on forever, while those faster will eventually be throttled (how quickly "eventually" arrives depends on how much faster than your threshold the requests come in).


avails.inc

This is a datafile that the package uses to try to translate non-English "Availability" and "Media" information into English. It is a separate module so that it can easily be updated as Amazon whimsically changes the text they use to denote certain things. A real business would use set codes that users would then translate into their local language, but Amazon is, as we all know, not a real business--so we have to try to accommodate their bizarre whims.

I am not an expert linguist: I made these by research and comparison (the obvious course of using a dictionary or translation service yields poor results, as many of these words, especially for "Media", appear to be technical terms of art). If you can improve or augment this module, feel free (but I ask, as a favor, that you email me with any such information, so I can share it in the next package version).


book-search.php

This is the PHP script that "handles" the associated HTML file named booksearch.htm, which is the general new-book-search page. The remarks above cover such arrangements.


bookcount.php

This is the PHP script that allows you to find title counts for any search phrase in any Amazon national division. Called with no parameters, it defaults to the search term set in your customize.inc file and the U.S. division. Since it's so easy to change data while using it, there is no need to try fancy calls to it.


booksearch.htm

This is the HTML handled by the PHP script book-search.php, cited above.


bookshop.htm

This is the HTML handled by the PHP script book-shop.php, cited just below.


book-shop.php

This is the PHP script that "handles" the associated HTML file named bookshop.htm, which is your bookshop's "front page". The remarks farther above cover such arrangements.

Please note this: the package installer script will rename this file to whatever name you supply as Datum #3 ("What is your front page named?") in the customize.inc file. The point here is to allow you to call your shop's "front page" by whatever SEO-worthy name you might please without requiring you to remember to make the duplicate. Just don't get confused wondering where this file went to.

The script automatically, and randomly (with a different result at each call) selects a "sample" book listing to display--you don't need to select and encode one (but you can, manually in the bookshop.htm file, if you like).


coopads.inc

This PHP-include file works with coopads.php (described just below); they are for the convenience of those who are members of the Digitalpoint Co-operative Ad Network. This script is called by coopads.php, and determines just two things: whether or not Co-op ads are to be run; and, if they are, the name of the Co-op-supplied PHP script to use for getting those ads.

This module defaults to being NON-operative! If you are a co-op member, you must edit this file by placing the PHP "comment" mark--a leading // --in front of the instruction near the top, so that the line originally--

return;

--becomes instead--

// return;

(The file itself contains a short form of this note, for your convenience.)

For the module to work, the Digitalpoint-provided PHP script ad_network_454.php must be in the package directory or any directory above it--up to 9 levels up--and that script must be able to find its datafile ad_network_ads.txt (it's usually put in the same directory as ad_network_nnn.php). Incidentally, these are server-file-system directories, not site-based directories, so if you have several sites off one primary server directory, you don't need to have multiple copies of the Digitalpoint files.

It is solely your responsibility to obtain and install whatever may be the latest Co-op ad_network_nnn.php file,
where nnn changes from time to time with upgrades, as the Co-op Network management feels necessary.

To upgrade, after obtaining and installing the latest co-op file,
just modify the file-name spec in this script (coopads.php).



coopads.php

This module is for the convenience of those who are members of the Digitalpoint Co-operative Ad Network. Depending on how coopads.inc (described just above) is set, this module will or will not display co-op network reciprocal ads on the Freebie-package bookshop pages; if that file is set to display, this file will cause displays--as required by the network--on all pages.

The module as supplied will insert 5 text-only ads in a small table with a pale yellow background and disclaimer comments. (It also modifies the supplied ads so that they are valid XHTML code, which they are not--at present--as supplied.) You can rather easily modify the module because the salient parts are basic HTML, just with each line included in some PHP boilerplate; that is, a line such as--

emit('<font size="1">Here are links to some other sites you might find interesting or useful:'.$crlf);

is equivalent to plain HTML that reads--

<font size="1">Here are links to some other sites you might find interesting or useful:

--so you can change this around to suit your tastes and needs.


cronf.php

This script is callable like any other PHP script (via browser, that is), and, if so called, will run for 5 or 6 seconds, then quit--during which time it has started, as a silent, "daemon" process, the entire updating process of searching each Amazon division for all titles relevant to your search phrase. So you can start this guy, then go about your life while, on your server, the scripts are doing their long, tedious work.

But the real purpose of this script is to allow you to start your update job--which I strongly recommend you do once a day--as a cron job: you set cron to run this script daily, and that's all she wrote.

(cron is a scheduling utility that runs on all *nix-based servers, and possibly others; it allows you to flexibly schedule unattended runs of scripts and commands. If you don't know how to use cron on your server, enquire of your host.)


customize.BAK

This is just a copy of customize.inc as initially supplied in this package, so that if you ever somehow screw up customize.inc so bad that you don't know where you are anymore, you can refer to this copy and start your customizing anew. (So, obviously, never modify this file itself!)


customize.inc

This file controls your customization of this package, and it is explained at vast length at the start of this docfile.


doall.php

This PHP script simply calls dofind.php (explained below) for, in turn, each of the six national Amazon divisions. It is the script that is started by cronf.php (cited above).


dofind.php

This PHP script, which requires the parameter "in" to be set to the desired Amazon national division--as in...

dofind.php?in=ca

...to work, runs the actual search workhorse script findbooks.php (described further below) as a "slave" process; that is done so that if findbooks.php ever "dies" before completion (as can happen on resource-limited servers), the search process can pick up from where it left off, rather than begin afresh (which could be a disastrous waste of time, and might never finish).


findbooks.php

This PHP script, which requires the parameter "in" to be set to the desired Amazon national division--as in...

findbooks.php?in=ca

...to work, is the actual search workhorse; it searches the specified Amazon division for all books returned by Amazon in response to the search phrase you set in customize.inc, and lists them as HTML Table "drop-ins" in 28 files placed in the /bookpages subdirectory of your main package directory, each named along the lines of dropins-B.ca.


finstall.php

This is, yes, the package installer. Here are the things it does:

  • It makes--if they do not yet exist--the seven subdirectories needed in the package main directory.

  • It permissions--or tries to--all the package files to 0777 (rwxrwxrwx).

  • It makes duplicates of all the package files in a dedicated archival subdirectory.

  • If there are no extant backups of the six customizable files, it creates a set.

  • It renames book-shop.php to whatever name you set forth in customize.inc.

  • It reads your current robots.txt file and generates a revised version, named robots.new, that you can substitute for your current version.

  • It makes a master Sitemap_index.xml file.

  • And it will tell you whether PHP's "Safe Mode" is On or Off on your server, and what "Server Interface" PHP is detecting (typically either CGI or Apache module).


free.php

This is the individual-edition reader script. It needs two data to work: the Amazon "ASIN" code (for real books, that's the ISBN with no hyphens) and the identity code of the Amazon national division. That would look like:

free.php?in=ca&asin=0123456789

You rarely if ever need to know that: this script is what is linked by the listings in the 28 title-list pages.

It comes without any Google AdSense code, as described some ways up in this file, but--as it notes there--you can manually re-code the script to include it.


freebie.inc

This is a workhorse "include" file that has data, functions, and various calculations in it. It is included into every PHP script in the package, and is essential for all.

This script auto-detects where the package is in your file structure, and supplies that information to all the package scripts.


freepasses.inc

This little file is read by the timer.inc script, described farther below; it is a place where you can list any IP addresses (in number form) to which you want to give a "free pass"--that is, allow to hit your server more often than the parameters in timer.inc allow (you might, for example, allow outsiders one hit every 5 seconds maximum, but allow some other site of your own a hit every second).

Note that the site in which the package is installed always has a free pass, so you don't have to put that in; this is only if you have other sites that might be checking your files on this one. You should have few concerns about search-engine robots, as most--including the Big Three--will honor a Crawl-delay directive in your robots.txt file, and any bot that ignores it deserves what it gets.

There is more information on those matters farther on in this document.


got.php

This script simply examines the package working files and reports how many titles were found in the most recent search for the specified Amazon division. It requires the appropriate division as a parameter:

got.php?in=ca

Normally you will have no need of this script, which is just another handy little "extra" tool.


hold.htm

This is the HTML handled by the PHP script holder.php, cited just below.


holder.php

This is the PHP script that "handles" the associated HTML file named hold.htm, which is the template for the title-display pages. The remarks above cover such arrangements.

This script generates the 28 title-listing pages. It requires some parameters--

holder.php?in=ca&ltr=B

would specify the letter-B titles found for Amazon Canada.


Install.html

You're reading it.


know.inc

This module, which you can delete, adds a little more change to each page by asking, in small print at the bottom of the page, "What do you know about X?", where "X" is one of nearly a million topics in an online encyclopedia; the "X" is a link to the encyclopedia article on the corresponding subject.

I run the encyclopedia site, and would appreciate it if you leave this in, but you certainly can remove it if you feel the need. Why not try it out and see how it looks?


makemail.inc

This is, to me, a very pleasing little module. It generates the following line, which is presented in very small, faint type at the bottom of the bookshop "front page":

This is a spammer-trapping spurious email address; please do not click on it!.

The email link concealed under that message has the general form:

24_240_130_222__12_02_05_08_00_08@mywonderfulsite.com

In that, the first four blocks--here shown as 24_240_130_222--are the IP Address of whoever or whatever has just loaded the page, while the rest (here as 12_02_05_08_00_08) is the date and time the page was loaded, in format mm_dd_yy_hh_mm_ss; and the email domain is as you set it in your customize.inc file.
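
To make that format concrete, here is a minimal sketch of how such an address could be assembled. It is written in Python purely for illustration (the module itself is PHP, and its actual code may well differ); the function name trap_address is hypothetical, not something in the package.

```python
from datetime import datetime

def trap_address(ip, when, domain):
    """Build a trap address in the form described above (hypothetical helper)."""
    ip_part = "_".join(ip.split("."))                # 24.240.130.222 -> 24_240_130_222
    time_part = when.strftime("%m_%d_%y_%H_%M_%S")   # mm_dd_yy_hh_mm_ss
    return f"{ip_part}__{time_part}@{domain}"

# The example from the text: loaded December 2, 2005 at 08:00:08.
example = trap_address("24.240.130.222",
                       datetime(2005, 12, 2, 8, 0, 8),
                       "mywonderfulsite.com")
print(example)
```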

The idea is that when you get spam emails addressed to that sort of address, you know for a dead-certain fact that they were obtained by a spamming email harvester (although the spammer-throttle described farther below will hugely cut down such harvesting, it cannot stop it altogether)--and moreover, you know the IP Address of the harvester and the exact instant it harvested that email address.

You can do several things with that knowledge. The most useful is to use your .htaccess file to put up permanent "Deny From" IP-based blocks against the harvester. You can also put a block in your email against both the harvester's address and the emailer's IP address (which are rarely if ever the same); that has a microscopic possibility of blocking a few more or less legitimate emailers who don't realize where the lists they buy come from, but blocking such as they seems little loss. Finally, if you're a stubborn cuss with time on your hands, you can use WHOIS services to track down the host for the offending IP Address and send them an abuse email explaining how and why you absolutely, positively know that that IP Address was harvesting, and when they harvested. Frankly, most hosts don't give a bowel movement in a hat about curbing abuse (especially the hosts that abusers like to rely on), but if it feels good, do it.
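
If you collect many such spam emails, you may want to pull the harvester's IP Address and harvest time back out of the address mechanically. The following is a rough sketch only, in Python for illustration; decode_trap_address is a hypothetical helper, not part of the package, and the "Deny from" line it builds assumes the older Apache directive style that the paragraph above refers to (check what your own server version expects).

```python
from datetime import datetime

def decode_trap_address(address):
    """Split a trap address back into (harvester IP, harvest time)."""
    local_part = address.split("@", 1)[0]
    ip_part, time_part = local_part.split("__", 1)    # double underscore separates the halves
    ip = ip_part.replace("_", ".")
    when = datetime.strptime(time_part, "%m_%d_%y_%H_%M_%S")
    return ip, when

ip, when = decode_trap_address("24_240_130_222__12_02_05_08_00_08@mywonderfulsite.com")
deny_line = f"Deny from {ip}"     # assumed .htaccess syntax; verify for your server
print(deny_line)
print(when.isoformat())
```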


makemaps.php

This is a standalone script that does what findbooks.php automatically does every time it runs: it generates six sitemap files--one for each Amazon division--listing all your per-title book files. It also auto-updates the master Sitemap_index.xml file with the date/time of each updated sitemap file.


makestatic.php

This script will list all files in your site, except those within the Freebie directory and its child subdirectories; it will, however, honor all exceptions it can find in your robots.txt file. Note that in using robots.txt, it does not discriminate by "User-agent": it takes every Disallow: statement as a block. Thus, if you have directories or files blocked to some searchbots but open to Google, you will need to manually add those into the resultant staticmap.xml file that this script generates in your root directory. It auto-updates the master Sitemap_index.xml file whenever it is run (which will normally be only once--but you can re-run it whenever your list of site pages outside the Freebie ones has been augmented). Note that this script automatically creates entries for all of the "fixed" pages associated with your multiple-bookshop Freebie package--the six "bookshop front-door" pages, the search pages, and so on; the makemaps.php script, and its built-in equivalent in the daily findbooks.php script, just map the individual-title book pages.
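
The robots.txt handling described above (every Disallow: statement counts, whatever User-agent it sits under) can be sketched roughly as follows. This is an illustration in Python, not the script's actual PHP; the function names and the sample paths are made up for the example, and simple prefix matching is an assumption about how blocked paths are compared.

```python
def disallowed_paths(robots_txt):
    """Collect every Disallow path, ignoring which User-agent it belongs to."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()      # drop comments and surrounding space
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())           # an empty Disallow blocks nothing
    return paths

def is_blocked(url_path, blocked):
    """Assumed rule: a file is skipped if its path starts with any blocked prefix."""
    return any(url_path.startswith(prefix) for prefix in blocked)

# Hypothetical robots.txt content, for demonstration only.
sample = """\
User-agent: *
Disallow: /private/

User-agent: SomeBot
Disallow: /drafts/   # blocked for one bot, but treated as blocked for all
Disallow:
"""

blocked = disallowed_paths(sample)
print(blocked)
print(is_blocked("/drafts/notes.html", blocked))
```

Under this reading, a file under /drafts/ would be left out of staticmap.xml even though only one bot is barred from it, which is why such files must be added back by hand.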


nonbooks.inc

A list used to determine, during a free-form search, what sorts of "Media" will be excluded from the results if the user has selected the "Real Books Only" option. Putting these in a distinct include module makes it easy to update or modify them.

(Note that the list is almost certainly incomplete as to non-English-language terms; if you can augment it, please let me know, so I can pass it on in the next version upgrade.)


phpinfo.php

Another little "extra": all this does when called is return a lengthy display of parameters relating to both the PHP setup on your server and also some general server information (as PHP sees it). This is so handy, I myself put it in the root directory of all my sites and use an .htaccess redirect so that just calling http://mywonderfulsite.com/info calls it.


pull.php

The findbooks.php script has a built-in failsafe for those days when Amazon is having a bellyache: it will not replace the previous day's search results if the current ones are not at least 80% as many titles. But once in a while--for example, if you decide to change your search term--a legitimate search may trip the failsafe. This script forces a use of the extant search results, even if they are less than 80% of the last ones. You call it with just the usual division ID parameter:

pull.php?in=ca

It will report what exactly it is doing. Note that this does not run a new search: it just uses the data gathered during the most recent one.
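
The failsafe rule itself is simple enough to state as a few lines of code. This is only a sketch in Python of the logic as described here (the real scripts are PHP and may implement it differently); should_replace and its parameter names are hypothetical.

```python
def should_replace(previous_count, current_count, threshold_percent=80, force=False):
    """Decide whether new search results may replace the previous ones."""
    if force:
        return True      # what pull.php is described as doing: skip the check
    # Integer arithmetic avoids rounding surprises right at the 80% boundary.
    return current_count * 100 >= previous_count * threshold_percent

print(should_replace(1000, 800))               # exactly 80%: accepted
print(should_replace(1000, 799))               # just under: failsafe trips
print(should_replace(1000, 500, force=True))   # forced, as with pull.php
```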


putback.php

This is a tool you should rarely if ever need. It takes the backed-up copies of the six key customizable files (customize.inc, amazonmessage.htm, booksearch.htm, bookshop.htm, hold.htm, and usedbooks.htm) and replaces whatever is extant with the backups. This is for cases in which you have somehow screwed up your current versions and want to quickly replace them with the older backups. (If you're not in a hurry, you can just download the backups and manually play with them.)


READ_ME.NOW

You presumably already did.


rollyourown.inc

This module is mildly useful, but is chiefly a demonstration of how easy it is to add in material of any sort to the package pages. Its contents are always just plain, straight HTML, and the HTML-comment-caller in the appropriate .htm file calls it and displays the HTML.


search.php

This is the script that does the search work called out in book-search.php (and its associated HTML file booksearch.htm). It takes a zillion specialized parameters, and should not be played with.


sitemaps.inc

This is the sitemap-making code used by findbooks.php, broken out as a PHP-includable module to make future updating simpler; it is very similar to the code in the standalone script makemaps.php (described above).


timer.inc

This is the spammer-throttle referred to several times above. There is extensive descriptive text in it, but the essence of it is this: you select a maximum average "hit rate" at which you will allow any one visitor to take files, and a time period over which the averaging should take place. Any visitor--human or robot--that takes files at or more slowly than the rate you have specified will be totally unaffected; visitors that take files faster than that will accumulate "penalty points", so to speak; when those points get too high, they will be blocked for some period (which you select), and will, during that period, receive a 503 "Service Temporarily Unavailable" message and HTTP Code. (Further attempts during the blocked period will extend their penalty time yet more.)

For background, Google has stated that it will never take files more often than every six seconds, and furthermore they and most other search engines (including all three major players) honor the robots.txt directive--

Crawl-delay: 5

--where 5 (or whatever you specify in your robots.txt file) is the minimum number of seconds allowed between hits. Hit rates much above once every 5 seconds can, as the cliché goes, bring a server to its knees, effectively amounting to a "denial of service" attack.

The module comes set with a minimum allowed average hit pace of once every 5 seconds, as averaged over a moving 60-second window; the penalty time is 10 minutes (which, believe me, is not severe enough--I still get spammers with dozens of log entries, meaning they took the penalty and came back dozens of times).

Speaking of logs: the module logs all denials with the IP Address, the "user agent" as given (very unreliable, as most spammers spoof someone or something else), and the exact date and time of the denial. The default log name is ErrantIPs.Log, but you can change that in the module.

Let me clarify a few points. First, the rate is calculated on a moving-window basis, so a visitor who takes a few files in rapid succession does not automatically incur a penalty--visitors need to take more than X files in the most recent window period. Second, throttling is not instantaneous: a visitor who is exceeding your specified rate by only a little (say a file every four seconds instead of every five) will have to go on like that for a while before being throttled, whereas one who is grossly exceeding the "speed limit" (say a file every second, or even faster) will get throttled very quickly, in barely over the window-size period (say, a minute).
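
For readers who like to see the idea in code, here is a deliberately simplified sketch of a moving-window throttle, in Python for illustration. It is not the module's actual algorithm: it just counts hits in the most recent window rather than accumulating "penalty points", so it trips sooner than the module is described as doing. The class and method names are hypothetical; the defaults mirror the shipped settings described above (one hit per 5 seconds, 60-second window, 10-minute penalty), and resetting the penalty on each retry is just one reading of "extend their penalty time".

```python
from collections import defaultdict, deque

class Throttle:
    def __init__(self, min_gap=5.0, window=60.0, penalty=600.0):
        self.max_hits = int(window / min_gap)   # 12 hits allowed per 60-second window
        self.window = window
        self.penalty = penalty
        self.hits = defaultdict(deque)          # ip -> times of recent hits
        self.blocked_until = {}                 # ip -> time at which the block ends

    def allow(self, ip, now):
        """Return True to serve the request, False to answer with a 503."""
        if now < self.blocked_until.get(ip, 0.0):
            self.blocked_until[ip] = now + self.penalty   # retrying restarts the penalty
            return False
        recent = self.hits[ip]
        while recent and recent[0] <= now - self.window:
            recent.popleft()                    # forget hits that left the window
        recent.append(now)
        if len(recent) > self.max_hits:
            self.blocked_until[ip] = now + self.penalty
            return False
        return True

throttle = Throttle()
polite = [throttle.allow("10.0.0.1", 5.0 * i) for i in range(100)]   # one hit per 5 s
greedy = [throttle.allow("10.0.0.2", 1.0 * i) for i in range(13)]    # one hit per second
print(all(polite), greedy[-1])
```

Even in this crude form, the two points made above hold: a short burst is harmless so long as the window total stays within bounds, and a visitor only slightly over the limit takes longer to trip the block than one hammering away every second.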

This module, whose basic idea and implementation are not mine, seems to me like the greatest thing since the peelable banana. No more trying to block based on user agent, which is usually spoofed by spammers, or by particular IP Address, which spammers change more often than their underwear: now you control access based on rapid real-time evaluation of bad behavior, period the end.


tryme1st.php

This is a one-time-use auxiliary script intended to verify that the package should not encounter any insurmountable problems being installed on your host. Its use is described near the top of this docfile, but to summarize what it does, it:

  • checks that you do not seem to have the fatal combination of PHP's "Safe Mode" being On and not a CGI interface (that combination, found only on a few quite ill-managed hosts, makes almost all PHP operation impossible);

  • determines from PHP, and displays for your verification, a few key data, mainly concerning where on your server the package resides;

  • tests whether you can use HTTP to open a remote file;

  • attempts to write, read back, then erase a small test file, just to be sure that the package has the necessary file-access rights on your server;

  • and, if for some reason the write-read-erase test fails, the script attempts to repermission the main package directory, then repeat the test. It reports its results.
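
The last two checks in that list (write-read-erase, then repermission and retry) amount to something like the following. This is an illustrative sketch in Python, not the script's PHP; the function names, the probe file name, and the 0o755 permission mode are all assumptions made for the example.

```python
import os
import tempfile

def write_read_erase(directory):
    """Write a small file, read it back, delete it; True only if all three worked."""
    path = os.path.join(directory, "freebie_probe.tmp")   # hypothetical file name
    payload = "test"
    try:
        with open(path, "w") as handle:
            handle.write(payload)
        with open(path) as handle:
            read_ok = handle.read() == payload
        os.remove(path)
        return read_ok and not os.path.exists(path)
    except OSError:
        return False

def check_directory(directory):
    """Run the test; on failure, try to repermission the directory and test again."""
    if write_read_erase(directory):
        return True
    try:
        os.chmod(directory, 0o755)   # assumed mode; the script's own choice is not stated here
    except OSError:
        return False
    return write_read_erase(directory)

with tempfile.TemporaryDirectory() as scratch:
    print(check_directory(scratch))
```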


unavails.inc

Analogous to nonbooks.inc, described farther above, this module is a list used to determine, during both regular and free-form searches, what sorts of "Availability" descriptive terms will be excluded from the results as "unavailable" (in free-form searches, the user can elect to bypass this filter and see all listed items, available or not). Putting these in a distinct include module makes it easy to update or modify them.

(Note that the list is possibly incomplete as to non-English-language terms; if you can augment it, please let me know, so I can pass it on in the next version upgrade.)


Upgrading.html

This is a docfile with just a few obvious notes on how to proceed if you are upgrading from an earlier version of this package. The jump from 1.x to 2.x versions is so large that it was not possible to conveniently automate the upgrading, but it is still not a big deal: most of what you will have done in manual customization can be easily carried across to the corresponding files in the new version.


used-books.php

This is the PHP script that "handles" the associated HTML file named usedbooks.htm, which is the general used-book-search page. The remarks above cover such arrangements. Note that, as ABE is international, there is no associated "in" parameter.


usedbooks.htm

This is the HTML handled by the PHP script used-books.php, cited above.


watch.php

A little extra, useful beyond the purposes of this package, watch.php calls an operating-system utility that reports data on running processes and displays it for you every so many seconds. The data it reports are described as:

  • USER: The owner of the process, typically the user who started it
  • PID: The process' unique ID number. These are assigned sequentially as processes start; when they reach 30,000 or so, the numbers start over again from zero, though processes 0-5 are usually low-level operating system processes which never exit.
  • %CPU: Percentage of the CPU's time spent running this process.
  • %MEM: Percentage of total memory in use by this process
  • VSZ: Total virtual memory size, in 1K blocks.
  • RSS: Resident Set Size, the actual amount of physical memory allocated to this process.
  • TTY: Terminal associated with this process. A ? indicates the process is not connected to a terminal.
  • STAT: Process state codes. Common states are S = Sleeping, R = Runnable (on run queue), N = Low priority task, Z = Zombie process
  • START: When the process was started, in hours and minutes, or a day if the process has been running for a while.
  • TIME: CPU time used by process since it started.
  • COMMAND: The command name. This can be modified by processes as they run, so don't rely on it absolutely.

The script's default report period is every 5 seconds, but you can call it with a parameter to set the value to whatever you want. For example--

watch.php?every=10

--would call a new display every 10 seconds. It will continue to run till you stop it.

There is a second parameter you can use, limit=y; that will restrict it to reporting processes for which php appears in the process data line. You would use--

watch.php?limit=y

--or, to combine the two--

watch.php?limit=y&every=10

Since this script uses a *nix system call, it will not work on Windows-based servers.

It is a reasonably handy way to see, among other things, how much CPU capacity a process is using. (On servers where PHP is cgi-wrapped, it is hard to determine at a glance which php process is what script, since they are all listed as the wrapper; you have to use some logical deduction.)
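
As a rough picture of what the limit=y option does, here is a small Python sketch of filtering a process listing down to the lines that mention php. It is illustrative only: the script itself is PHP, the function name is hypothetical, keeping the header row is my assumption, and the sample listing is made up (the columns merely follow the list given above).

```python
def php_lines(ps_output):
    """Keep the header row plus every process line in which 'php' appears."""
    lines = ps_output.splitlines()
    if not lines:
        return []
    header, rows = lines[0], lines[1:]
    return [header] + [row for row in rows if "php" in row]

# Made-up sample data, for demonstration only.
sample = """\
USER   PID %CPU %MEM   VSZ  RSS TTY STAT START TIME COMMAND
root     1  0.0  0.1  1500  500 ?   S    Jan01 0:03 init
me    4321 12.5  2.0 40000 9000 ?   R    10:15 0:07 /usr/bin/php findbooks.php
me    4400  0.0  0.3  6000 1200 ?   S    10:16 0:00 sshd
"""

filtered = php_lines(sample)
print("\n".join(filtered))
```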


--==ooOOoo==--