The Spam War

By far the biggest problem with using the Internet is the massive amount of spam that is clogging up the works. The most common kind is the spam you get in your email inbox, but increasingly there are tons of junk sites with nothing but useless content. If you run a forum or blog, you have to constantly monitor things because of the comment spam you get daily. Lately, I decided to start fighting back. Though I am not sure how effective the FTC is these days, I started forwarding to the FTC any and all spam that arrives at any of my email addresses.

If you get spam email that you think is deceptive, forward it to spam@uce.gov. The FTC uses the spam stored in this database to pursue law enforcement actions against people who send deceptive email.

I found out that Mac OS X has a really handy feature, unique to the 10.4 release. Regardless of the program, if you hold the pointer over a link, the operating system shows you a tooltip with the actual URL that the link points to. I am not sure this would help with short URL redirection services, but it does help quite a bit with many phishing emails.
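The same check the tooltip lets you do by eye can be roughed out in code. The following is only a sketch, using Python's standard library, of comparing the address a link displays with the address it really points to. The names `LinkCollector` and `suspicious_links` are made up for this example, and the comparison is deliberately naive: it treats www.example.com and example.com as different hosts, and it cannot see through redirection services.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects (href, visible text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def _host(value):
    """Return the host name of a URL, or of a bare name like www.example.com."""
    if "://" not in value:
        value = "//" + value
    return urlparse(value).hostname


def suspicious_links(html):
    """Return (visible text, real href) pairs where the two hosts differ."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    flagged = []
    for href, text in collector.links:
        # Skip link text that does not look like an address at all.
        if not text or " " in text or "." not in text:
            continue
        shown, actual = _host(text), _host(href)
        if shown and actual and shown != actual:
            flagged.append((text, href))
    return flagged
```

Feeding it the HTML body of an email returns the links whose visible address and real destination disagree, which is the classic phishing pattern.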

Peastat Stats

With all the talk of Google Analytics and Mint, it is easy to forget about Tom Dyson’s tiny Peastat. If you need quick stats without a lot of graphics and without having to log in to some other site, try Peastat. A small Python-based script, it gives you plenty of information and does so quickly.

Peastat is divided into four sections. The top section is a quick summary, while the next three are what most webmasters will be interested in. Recent Popular Pages tells you which pages are the most popular right now, Recent Referrers clues you in on which sites are sending traffic your way, and Recent Popular Search Terms gives you the all-important keywords that visitors are using to find your site.

Learn more about Peastat on WebKeyDesign’s Forum.

Blog Comment Spam

Last year, Charles Arthur interviewed a link spammer for The Register, detailing why and how link spammers target weblogs. The story is interesting in that the spammer describes how most sophisticated comment spam is actually sent through proxy machines, so that the spammer is not penalized by his own ISP. However, the story of link spammers is no different from most other stories on the Internet. It all eventually leads to the idea that the ethics of such actions are very debatable, but that because there are no consequences, the strategy continues to be very popular.

This reminds me of something I read a long time ago about how locks on doors are not really for thieves, but for everyone else: a thief will break your window to get inside and rob you, while a normal person will only be tempted if you leave your doors unlocked. The same reasoning applies to spammers. If they can make money by spamming you without being penalized, they will. In the business world marketing is everything, and the temptation to view link spamming (or any type of spamming) as legitimate marketing is so great that many spammers do in fact describe themselves with any number of labels, such as affiliate, search engine optimiser, advertising consultant, or simply marketer.

The Webmaster’s Spam Problem

Regardless of the ethics, comment spam is, in the end, the webmaster’s problem and no one else’s. This means that just as the link spammer invests lots of time in tweaking his arsenal of spam tools, the webmaster must invest in some countermeasures to protect his site and content. For WordPress users, the WordPress Codex has lots of helpful information on combating comment spam, but while browsing a blog myself, I noticed a completely different strategy. The webmaster had added a notice to the bottom of her blog warning spammers that any comment spam would result in their domain being reported to Google as a spammer.

Google actually maintains a Report Spam page, in which you are encouraged to report domains that are using deceptive practices to achieve higher search engine rankings. I am not sure if Google would accept link spammers and the domains they market as actual spam domains, but ultimately it is up to Google to decide this. I will note that Google is known to check the sites of the person who submitted the complaint as well, so Google does not take these complaints at face value.

A while back Google worked with MSN Search and Yahoo! to implement the nofollow attribute for links, which was supposed to prevent blog comment spam by giving no emphasis or weight to comment links, but obviously this has not totally deterred link spammers. The rel="nofollow" attribute is implemented in WordPress 1.5 and other major blog scripts, yet I still receive many comment spam entries weekly.
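To make the attribute concrete, here is a rough sketch of how a blog script might render a commenter's link so that it carries rel="nofollow". This is not how WordPress does it (WordPress is written in PHP); the function name `comment_link` is invented for this example and the code is only an illustration in Python.

```python
from html import escape


def comment_link(url: str, text: str) -> str:
    """Render a commenter-supplied link with rel="nofollow".

    Both values are escaped so that a comment cannot inject its own markup.
    """
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(url, quote=True), escape(text)
    )
```

The attribute only tells search engines to give the link no weight. It does nothing to stop the comment from being posted in the first place, which is why the spam keeps arriving.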

It is evident that dealing with spam is going to be an ongoing task for webmasters, and that this is just one of many problems we have to deal with as the price of having a space on the Internet.

Installing Scripts For Your Site

You just downloaded a script from HotScripts.com that you think will give your web site some much-needed features, or maybe you decided that a complete content management system is what you need to base your site on, and now it is time to install it. Except that all you keep getting are error pages, and now you cannot get your site working!

Script installation can quickly become a nightmare if you do not know what you are doing or if you missed a step somewhere in the process. This is why programmers write installers for their scripts, but even a good installer cannot detect everything that could go wrong.

Here are some good tips to keep in mind when installing a script on your site:

Demo Your Script:

Although it is not always possible, most popular scripts have a demo site or demo area that shows you what they do. This is the best way of seeing if a script is right for your site. A good site for trying out most web-based CMS and blog scripts is OpenSourceCMS.com, which has demos and user comments on many open source scripts.

Research The Requirements:

All scripts have certain requirements, be it a specific version of PHP or MySQL, or perhaps a specific Perl module. Make sure your webhost account meets these requirements. In cPanel, most version and module information is located on the left when you log in to your account.

Choose The Correct Script:

Resist the temptation to install a beta or alpha version of a script, unless absolutely necessary. Even full releases of scripts can have defects, so sometimes it is better to wait a couple of days after a regular stable release.

Unless you are a programmer and want to rewrite the script, a good support site for your script is a must. This is even more critical if you are paying for the script. You will want to make sure the script is at least supported and if you run into a problem you have a resource to go to.

Never install a script without looking at some of the script’s source code. If the script is encoded so that you cannot see the source code, you will want to make sure the author of the script is reputable. Any script you install on your site carries security risks, and installing a script you know nothing about is just asking for trouble.

Read The Installation Instructions!

This cannot be said enough, because even when you read the instructions, you might miss something, so make sure you follow each step of the installation process.

Most MySQL-based applications will require you to create the database and user account in cPanel prior to installing the script, so make sure you do this if needed.

A configuration file is often edited manually if there is no installer for the script.

If you run into problems running the installer or getting the script to run after installation, make sure you have the right file and folder permissions. Some webhosts require that all PHP scripts be set to 644 and all directories to 711 or 755; otherwise Apache will return a 500 error. You should be able to fix permissions with your FTP client or in cPanel’s File Manager.
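If you have shell access and many files to fix, the same permissions can be applied in one pass. This is only a sketch in Python: the function names `fix_permissions` and `mode_of` are made up for this example, and it assumes your host wants 644 for files and 755 for directories, so check your webhost's documentation before running anything like it. Note that it resets every file it finds, including any that a script expects to be writable or executable.

```python
import os
import stat

FILE_MODE = 0o644  # owner read/write, everyone else read-only
DIR_MODE = 0o755   # owner full access, everyone else read and traverse


def fix_permissions(root: str) -> None:
    """Set every directory under root to 755 and every file to 644."""
    for dirpath, _dirnames, filenames in os.walk(root):
        os.chmod(dirpath, DIR_MODE)
        for name in filenames:
            os.chmod(os.path.join(dirpath, name), FILE_MODE)


def mode_of(path: str) -> int:
    """Return only the permission bits of a path, for checking the result."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

Calling `fix_permissions` on the folder where the script was installed, and then `oct(mode_of(path))` on one of its files, shows whether the change took effect.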

Secure Your Script:

If the installation went fine, and everything is running okay, the final step is to go back and delete any installer file or reset permissions to 644 for certain configuration files. Again, go back to the script’s documentation and see if there is any manual cleanup or security changes that you need to make once the program is installed.

ToDo List for New Web Sites

When setting up a new web site, there is obviously a lot of initial work, like choosing your domain name and setting up your main page, and then there are the little things that we either do not know about or which we totally forgot about. This short list highlights some of those things.

Contact Information

Your new domain should have at least two email accounts set up. One is postmaster@yourdomain. All email servers are required to have a default postmaster address, so this should be the first email account you set up. The other account will usually be some sort of default contact address, like webmaster@yourdomain. This will be the address that any comments or inquiries from your site are sent to.

In cPanel, you set up email accounts from the Mail manager control panel.

Note that you do not have to publish the default contact address anywhere on your site. In fact, it is recommended that you set up a Contact Page where site visitors can enter comments into a form that automatically emails the webmaster account. The reason to use a form instead of a normal mailto link is so that this email account does not get bombarded by spam emails.

Secure Your Site

Depending on your website, you might want to maintain a certain level of privacy on your site.

The first thing to do would be to turn off Indexing for certain directories that you do not want to be listed. The default behavior of Apache is to load an index page of some sort, but if none exists, Apache will list the contents of the directory. For example, if you have a directory named myfiles and it does not have an index page, then when someone goes to yourdomain/myfiles/, Apache will list every file under this directory.

To turn off indexing, go to cPanel: Index Manager and turn off indexing for each directory you want to change. You must do this individually for each directory.

But even if you turn off indexing, search engines like Google and Yahoo can still index your site, including these directories, so you will want to set up a proper robots.txt file at the root of your public_html folder.

A simple robots.txt file like this tells all spiders to crawl your site, but to stay out of the specified directories:

User-agent: *
Disallow: /cgi-bin/
Disallow: /myfiles/

SearchEngineWorld has a tutorial on the robots.txt file, if you are interested in customizing your settings further.
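Before uploading a robots.txt file, you can check that its rules do what you expect. The sketch below uses Python's standard `urllib.robotparser` module to test the rules shown earlier; the helper name `build_parser` and the yourdomain.example addresses are placeholders for this example.

```python
from urllib.robotparser import RobotFileParser

# The same rules as the sample robots.txt: every spider may crawl the
# site, except for the two listed directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /myfiles/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# True means a well-behaved spider is allowed to fetch that address.
print(parser.can_fetch("*", "http://yourdomain.example/index.html"))
print(parser.can_fetch("*", "http://yourdomain.example/myfiles/notes.txt"))
```

Keep in mind that robots.txt is only a request. Well-behaved spiders honor it, but it does not actually block anyone from reading those directories.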

Search Engine Optimization

It only makes sense that once you secure your site, you will want to make it searchable and popular with Google, Yahoo, and others, right? Although you could seriously spend a lifetime researching the ins and outs of Google PageRank, let’s just cover the basics of getting your site listed.

Permalink it!

It is best to have a well organized site, meaning you need to make sure your site can be easily navigated. Even if you have good page navigation, or your weblog application handles that for you, you will want to make sure that all URLs for your site are search engine friendly. For blogs and content management programs, this means enabling some kind of permalink structure, which turns complicated numerical URLs like /217621382/2812/ into nice word-friendly URLs like /all-about-skateboarding.
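Most blog scripts build these word-friendly URLs for you, but the idea is simple enough to sketch. The Python function below, with the made-up name `slugify`, shows one possible way to turn a post title into a permalink slug. It is not the algorithm used by WordPress or any other particular script.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a post title into a lowercase, hyphen-separated URL slug."""
    # Reduce accented letters to their plain ASCII base letters.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace every run of characters that is not a letter or digit with one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    # Drop any hyphens left at the start or end.
    return slug.strip("-")
```

For example, `slugify("All About Skateboarding")` gives `all-about-skateboarding`, which can then be appended to your domain as the page address.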

Meta-Tag it!

Then there are meta-tags, which may or may not matter for a higher PageRank, but are important nonetheless if you want to describe your content in any meaningful way. To learn more about meta-tags, read this excellent tutorial.

SiteMap it!

Any decent-sized site needs some sort of site map page to show search engine spiders where to go, and to help visitors learn more about your site. There is no rule as to what a site map page should look like, but Apple.com’s Site Map is a good example of how to organize one.

The 404 Error Page!

A customized 404 Error page is essential if you want to redirect visitors and keep them from leaving your site. See my 404 Error Page Tutorial on how to set one up in cPanel.

The Favicon

Perhaps nothing makes your site more distinctive than the favicon. For people who bookmark your site, the favicon is that little icon that shows up in My Favorites in IE or under your Bookmarks in Firefox. Not all web sites have one, and even the implementation differs depending on the browser, since the favicon is not an officially recognized standard.

I tend to use the following code for it:

<link rel="icon" href="http://domain_name/favicon.ico" type="image/x-icon" />

Doing a search on Google for Favicon Tutorial should give you plenty of tutorials for making your own. Essentially, a favicon is a 16×16 pixel graphics file which you can create in most graphics programs. This tutorial shows you how to make one in Photoshop.

Choosing Your Domain Name

There are many steps to starting a web site, but the most often overlooked one is the first one, namely what domain name to register for your new site. Undoubtedly, almost any single-word domain name is taken by now, and two-word domains are also hard to come by, so this leaves you with a three-word domain. For example, when I started WebKeyDesign, I had to choose an inventive three-word domain name like WebKeyDesign.com, because KeyDesign.com and WebDesign.com were already taken, and even a less business-like name, like BlueMidnite.com or .net, was taken. There are of course the letter and number combination domain names like 123DesignSomethingName.com or perhaps 931DesignSomethingName.com, but these domain names are not as valued due to their complexity.

So what exactly is a good domain name?

First off, know who the target audience is for your site.

If it is a business, you will want to have your company name as your domain name, but seeing how many domain names are already taken, if you absolutely cannot get YourCompanyName.com, the second-best option is to come up with a name that describes your business or what you sell, like SteelSecurityDoors.com or BestValueDoors.com.

For a personal site, like a weblog, you have a lot more options. Most business sites want to be a .com site because 9 out of 10 times, people will try a .com site first. But for a personal site you can choose a .net or .info domain name. I would not recommend naming your site FirstNameLastName, unless you really want to promote yourself as a business. Instead, you should probably choose a domain name that reflects your content: DebbiesFishTales.net and AngryManCries.info are some interesting names for weblog sites. If yours is going to be a technology-oriented site, you should strive for a .net domain name if you can.

Avoid hard to spell domain names.

Although you can spell words differently, many people will find it hard to find your site if you use a peculiar spelling or use a hard to remember combination like BestValueDoorsbyACMEandSons.com. You want to keep the domain name easy to remember and easy to spell to help people remember your site.

Do not choose a domain that resembles another one.

Some people have deliberately chosen domain names that either mimic or closely resemble more famous domain names like Microsoft.com or even Google.com. This just invites lawsuits and other legal issues. It is best to choose a domain name that does not lend itself to such a reputation. If you find that your domain name is taken as a .com, but not as a .net or .info, think about what your site will have in common with the currently registered site. If there are any similarities, I would consider choosing a different name altogether.

Lastly, to check if a domain name is taken, you can go to Network Solutions or any other registrar and look up any domain name to see if it is taken already.

If you absolutely cannot think of anything on your own, you might consider using a domain name generator. This SitePoint article discusses popular domain name generators.

Choosing your domain name is an important step in creating your web site identity. It will define your site immediately and impact it in more ways than you know, so you should take some time in choosing the right one for your site.