Blogs Eye

Down Again, Up Again

(WordPress 4.1 Beta1 has a new fancy post writing interface that is very annoying. Things keep popping up or disappearing. I’ll have to get used to it.)

I woke up this morning to 500 errors on all my sites. The new software intercepted another toxic code-insertion attempt and logged it, and in doing so scrambled the end of my htaccess file. I am going to stop listing the reasons for blocked code in my htaccess. I thought I had put a nice filter on the reason codes, but for some reason it did not work.

The good news is that the malware detection code for the new plugin stopped a really noxious attack on my site. I think that I might make this a standalone plugin and release it to the WordPress repository and I might stop some site hacking.

I am thinking about checking file uploads, too. I could then check for nastiness in uploaded files before the upload is processed.
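The idea, sketched in Python rather than the plugin’s actual PHP (the patterns here are just hypothetical examples, not my real rule list):

```python
import re

# Hypothetical patterns; a real scanner would use a larger, tested list.
SUSPICIOUS = [
    re.compile(rb"eval\s*\(\s*base64_decode", re.I),  # encrypted PHP payloads
    re.compile(rb"<\?php", re.I),                     # PHP hidden inside an "image"
    re.compile(rb"UNION\s+SELECT", re.I),             # SQL injection strings
]

def looks_malicious(data: bytes) -> bool:
    """Return True if the raw upload contains a known-bad pattern."""
    return any(p.search(data) for p in SUSPICIOUS)

# Example: a fake "image" upload with embedded PHP
upload = b"\x89PNG...<?php eval(base64_decode('aGFjaw==')); ?>"
print(looks_malicious(upload))   # True
print(looks_malicious(b"just a plain text file"))  # False
```

The point is to reject the file before WordPress ever processes the upload.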


SSL and Spam

I have configured my site JT30.COM to use HTTPS through my CloudFlare account. My hosting company charges $200 to configure SSL on my site, so the free CloudFlare account is the one for me.

I am configuring SSL to 1) see how it interacts with my spam software, and 2) improve my website’s page rank with Google, which now gives extra points for SSL.

The setup went easily. I configured the site on CloudFlare, which told me that I had to change my DNS, so I did. Luckily I had not actually visited the JT30.COM site today, so I did not have to worry about cached DNS. Sometimes you have to wait a few hours for DNS to refresh.

After I finished configuring CloudFlare I pinged the site, and the response showed that CloudFlare was working. I went to the site and it was still there. Then I tried HTTPS://www.jt30.com and it was not working correctly: the page referenced http: resources, such as CSS files, that would not load on the https: site.
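Mixed-content problems like this can be found mechanically by scanning the HTML for resources requested over plain http:. A rough sketch in Python (just the idea, not anything I run on the site):

```python
from html.parser import HTMLParser

class MixedContentFinder(HTMLParser):
    """Collect http:// resource URLs on a page meant to be served over https."""
    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Stylesheets, scripts, and images loaded over http break an https page
            if name in ("src", "href") and value and value.startswith("http://"):
                self.insecure.append(value)

page = '<link rel="stylesheet" href="http://www.jt30.com/style.css">'
finder = MixedContentFinder()
finder.feed(page)
print(finder.insecure)  # ['http://www.jt30.com/style.css']
```

Any URL it reports needs to be rewritten as https: or as a protocol-relative link.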

I tried changing the WordPress settings to HTTPS for the site address, and the site immediately hung with a redirect loop error.

I had to log into phpMyAdmin through the host panel and manually fix the wp_options table to get it back to http.

I then tried the CloudFlare SSL plugin. After installing it I tried https: and it worked. I then changed the site URL to https and that worked.

So it only took about 10 minutes to switch over and the only minor hiccup was WordPress trying to do endless redirects when I turned on https. The plugin stopped that right away.

The secure icon has an exclamation point because there is no Site Information configured. I assume this is a CloudFlare issue so I have to research that.

Now I will monitor the traffic and see if it improves as Google finds out that I did this.

I will watch the spam statistics and see if I get any errors from JT30.com.

If I go a week without major crashes, I will start switching over other sites to HTTPS.


AudioCD.com

Back in 1998 I registered AudioCD.com. In those days Network Solutions was the only way to register a domain, and Internic.net allowed me to search for domain names. I found some dictionary databases and lists of common English phrases and ran millions of checks against the whois database looking for good domain names. NetSol shut me down after about a month of me hitting them at 40,000 requests an hour. I discovered lots of good names. In those days there were even a few two-character names still available. I remember that I could have registered several cool two-character domains with a dash in them, like “-0” or “Z-”, and I should have. Who knew?

I found AudioCD, which seemed to me to be a money maker, and I bought the domain for $40. In those days you needed a DNS server to buy a domain, so I used my company’s DNS. I’ve sat on the domain ever since, hoping to one day sell it.

I waited too long. The world has moved beyond Audio CDs and gone digital. You can still buy CDs at some stores, but nobody does.

I have decided to keep the domain for one more year (cost about $13) in the hopes that I can think of something good to do with it.

I have a few other domains that I am letting slide. GThread.com will expire next month, and some more will expire early next year. I am letting go of my baseball “Magic Number” sites because Facebook has made them obsolete. These sites don’t pay for themselves, although at one time I made lots of money on them.

I own KPGraham.com and KeithGraham.com and I think I should drop these, too. I don’t need a vanity site for my resume anymore. I don’t sell myself.

I have 40 domains, and I intend to bring that down to 25 by this time next year. In the meantime I am going to put some of them up on an auction site to see how they do.


Accept Headers

Accept headers are sent by the browser to tell the server what kinds of content the browser can handle. Servers don’t require an accept header; it is described as optional in the HTTP documentation.

If you don’t send my software an accept header you are blocked, banned, and reported.

How can this be? The reason is simple. All major browsers (IE, Chrome, Firefox, etc.) send accept headers. If my software does not receive an accept header, then the request is not coming from a browser. Humans use browsers to surf the web. If you are not sending me an accept header you are not human, but a robot.

It’s that simple. I am amazed at how many requests are blocked because there is no accept header. I get nervous and spot-check them from time to time, and every time the request comes from a server farm, or from China or Russia, so I can safely assume that the missing accept header means a robot is hitting my site.

I also reject requests with a missing http_host header and, when the method is “POST”, requests with a missing http_referer header or an http_referer header that does not match the website.

So if you are writing any software that needs to hit my site, including RSS readers or automated GETs of my spam lists, then you must provide a set of real HTTP headers.
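Put together, the checks described above amount to something like this (a Python sketch of the logic, not the plugin’s actual PHP; header names are shown lowercased):

```python
def should_block(method: str, headers: dict, site_host: str) -> bool:
    """Apply the header checks described above. Returns True if the request is rejected."""
    if "accept" not in headers:        # no accept header: not a real browser
        return True
    if "host" not in headers:          # HTTP/1.1 requires a Host header
        return True
    if method == "POST":
        referer = headers.get("referer", "")
        if site_host not in referer:   # POSTs must come from a page on this site
            return True
    return False

# A browser-like GET passes; a headerless robot GET does not
print(should_block("GET", {"accept": "text/html", "host": "blogseye.com"}, "blogseye.com"))  # False
print(should_block("GET", {"host": "blogseye.com"}, "blogseye.com"))                         # True
```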


Malicious attack last night

I have an experimental plugin that detects SQL injection and malicious code-insertion attacks. I watch the Apache logs and another log that I create for odd things, and I have found robot probes that try to insert SQL into GET strings or PHP eval functions that load up encrypted code.

The plugin works well and I catch dozens of attempts per day. Unfortunately, the part that updates the htaccess file in real time writes a comment containing the offending string. One of the strings had an interesting combination of garbage (by chance, I think) that corrupted the htaccess file. As a result the site was down from around 1 AM until 8:40 this morning. I have fixed the plugin to properly truncate and encode the strings so that the file does not get corrupted again.
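The fix boils down to never letting raw attack bytes into the file. A sketch of the sanitizing step in Python (the real plugin is PHP; the length limit here is arbitrary):

```python
def safe_comment(payload: str, limit: int = 60) -> str:
    """Reduce an attack string to something safe to embed in an htaccess comment."""
    # Keep only printable ASCII; a newline or control byte in a comment
    # could start a new (broken) directive and take the whole site down.
    cleaned = "".join(c if 32 <= ord(c) < 127 else "?" for c in payload)
    return "# blocked: " + cleaned[:limit]

attack = "?q=1' UNION SELECT\npassword FROM users--"
print(safe_comment(attack))  # one line, newline replaced with '?'
```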

I am sorry for anyone who needed to access the site during this time.

BlogsEye.com is my test bed. I run bleeding edge nightlies from WordPress and if it goes down there is no great loss to me. I just fix the problem and bring it up. There are, however, a hundred or so surfers who appear to be human according to the logs, so the problem must have blocked at least a few dozen people from accessing the site last night. Probably the greater damage to me was the beneficial spiders like GoogleBot hitting a brick wall and hurting my search rankings.


htaccess master list rebuilt and plugin progress

I started the bad-neighborhood concept a few months ago. Basically it’s one strike and you’re out. If I receive spam, a threat, a WGET request, or an over-active spider at any of my sites, I block the IP immediately.
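The one-strike block ends up as ordinary Apache 2.2-style directives in the htaccess file, something like this (the addresses are documentation examples, not real offenders):

```apache
# Block a single offending IP and a whole bad-neighborhood CIDR
order allow,deny
deny from 203.0.113.45
deny from 198.51.100.0/24
allow from all
```

Because Apache evaluates these before any PHP runs, a blocked robot costs almost nothing.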

I have a new plugin tentatively titled WP Protection that does this. This is my replacement for the Stop Spammer plugin. I’ve ripped out the various tests from Stop Spammers and put them into a more Object-Oriented framework. It can either block the spam, or update the htaccess file with a deny, or both.

I have a semi-automated step that does an inquiry at lacnic.net for each blocked IP and returns the IP range that contains it, expressed as a CIDR block. If a server company tolerates a spammer, I can block the company’s whole CIDR. This is extremely aggressive, and I wound up blocking most of China and Russia and all of Vietnam. Blocking the CIDR is necessary because most residential ISP addresses are dynamically allocated and a user gets a new IP each time they log in. Blocking individual IP addresses is like a game of Whack-A-Mole.
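Checking whether a later visitor falls inside a blocked range is a one-liner with Python’s standard ipaddress module (the range here is a made-up example, not one from my list):

```python
import ipaddress

# Example CIDR as returned by a whois lookup for an offending IP
bad_neighborhood = ipaddress.ip_network("203.0.113.0/24")

def in_bad_neighborhood(ip: str) -> bool:
    """True if the visitor's IP falls inside the blocked CIDR."""
    return ipaddress.ip_address(ip) in bad_neighborhood

print(in_bad_neighborhood("203.0.113.99"))  # True
print(in_bad_neighborhood("198.51.100.7"))  # False
```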

I’ve made some mistakes doing this, and the master list wound up a little scrambled. I deleted it yesterday and overnight rebuilt it based on the StopForumSpam.com 7-day list of spammers. It lists more than 4,000 bad neighborhoods. I also rewrote the compression program that combines contiguous ranges into one CIDR; it was the source of the scrambling. It seems to work well now.

A byproduct of the list is a white-list of ranges. These are almost all North American and Western European ISPs. It also white-lists search engines and Amazon AWS. Amazon is a major source of spammers who sign up for a free trial and send out spam for a few hours until Amazon blocks them. However, it is fruitless to permanently block Amazon, because they shut down the spammers, and blocking an Amazon CIDR winds up blocking services that I count on, like RSS sharing services and other beneficial robots. I white-list CloudFlare, PayPal, and a bunch of other services. I do block Digital Ocean, though: I get too much spam from them, because it is so easy to take a free test drive on their system, and spammers know this.

I am concerned that the white list is biased toward America and Europe while the black list is biased against Russia, China, and third-world nations. I ran a program that automatically notified service providers whenever I received a spam. With the exception of a few spam hosts in Canada and the US, Western hemisphere providers promptly addressed the problem, whereas Eastern hemisphere providers ignored me. If an ISP can convince me that it has cleaned up its act, I will gladly white-list it. I have been searching for lists of residential ISP IP ranges. Hosting companies, however, should never hit my site; I don’t want to white-list hosting companies unless they can convince me that they keep a clean house. I recently received a complaint from someone in Thailand that they were blocked from my sites. Unfortunately, Thailand is one of the worst of the spammer nations, and I have received thousands of spam attempts from IPs in the same CIDR as the person complaining. I feel justified in blocking the whole range.

The WP Protection plugin will not be shared via the WordPress Repository. I have learned my lesson. I cannot make money on WordPress. Neither can I find time to support all the problems users have from installing an aggressive anti-spam program. I have opened an account on ClickBank.com and I will try to sell the plugin there. It will be better for me if I support 50 paid users rather than 150,000 unpaid users.

WP Protection is coming along. I don’t have the settings page yet, so I can’t release it for beta. I will let it go free to a few users when I think it is ready for testing.


.htaccess finally reached the tipping point

The .htaccess file I’ve been using may have gotten too big. I am getting “connection reset” errors on all of my sites. I deleted the deny directives and brought the file down to about 1.5k, so we’ll see if the errors clear up. It could be a DoS attack, but I don’t know.

There is always the possibility that I’ve been hacked and that by opening up the site again, I am just inviting people to abuse me.

I’ll watch it for a while and I may add the denies back to the htaccess. In the meantime I am still detecting spam and denying individual IP addresses. I especially want to keep Yandex and Majestik out of my sites.


Less spam

I think my spam blocking is starting to work.

I checked my spam stats today for the first time in a few days, and out of 14 blogs I had five hits in the Stop Spammers history. When I started this project, Stop Spammers would report about 500 to 1,000 blocked attempts per day across all blogs.

Traffic at my sites is down to about one third of what it was, but I think that is because so much of it was from spammers and other robots. The income from ads has actually started to climb again. I think I was penalized for a while because spammers were clicking links. (I have heard most of these clicks come from Vietnam.)

I run a scan of my logs that calculates what percentage of all hits is blocked with a 403 access denied message, and it is down to 15%. When I started blocking, as much as 60% of traffic was blocked. I think the spammers learn that they can’t access my sites, so they stop trying. The 15% that do get blocked look like spammers, though: they are mostly from countries outside of North America and Western Europe. I hate to discount Russia, China, and third-world countries, but honestly, someone from Indonesia is not likely to be interested in my ramblings on science fiction.
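The log scan itself is simple; a minimal sketch of the idea against Apache combined-format lines (two fake lines stand in for a real log):

```python
# Count what fraction of requests were answered with 403 (access denied)
lines = [
    '1.2.3.4 - - [date] "GET / HTTP/1.1" 200 512 "-" "Mozilla"',
    '5.6.7.8 - - [date] "POST /wp-login.php HTTP/1.1" 403 199 "-" "bot"',
]

def blocked_percentage(log_lines):
    # The status code is the first field after the quoted request string
    statuses = [line.split('" ')[1].split()[0] for line in log_lines]
    blocked = sum(1 for s in statuses if s == "403")
    return 100.0 * blocked / len(statuses)

print(blocked_percentage(lines))  # 50.0
```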

I am thinking about making static clones of some of my websites without the huge IP deny list. That way people can get at the content but are not able to log in or leave messages. There is an SEO penalty for duplicated content, I think, so I will have to consider it.

Since I have been getting so few new spam hits, I have started grabbing the Stop Forum Spam nightly list. It is about 9,000 entries long. I filter out the IP addresses I already know are bad or white-listed, and it comes to about 400 new IP addresses a day. I add these to the master list when I generate the IP deny file.
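The filtering step is just set subtraction; a sketch in Python with made-up addresses standing in for the nightly download:

```python
# Drop IPs that are already blocked or explicitly white-listed
nightly = {"203.0.113.5", "198.51.100.9", "192.0.2.77"}   # from the StopForumSpam list
already_blocked = {"203.0.113.5"}
white_listed = {"192.0.2.77"}

new_ips = nightly - already_blocked - white_listed
print(sorted(new_ips))  # ['198.51.100.9']
```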

I wrote a plugin that checks incoming IP numbers against my master “bad neighborhood” list, but it is very slow. I have to figure out a way to speed it up for it to be effective.
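One possible speed-up (an assumption on my part, not what the plugin currently does) is to pre-sort the CIDRs as integer intervals and binary-search them instead of scanning the whole list per request. Sketched in Python with example ranges:

```python
import bisect
import ipaddress

# Pre-sort CIDR ranges once as (start, end) integer intervals
cidrs = ["203.0.113.0/24", "198.51.100.0/24"]
intervals = sorted(
    (int(n.network_address), int(n.broadcast_address))
    for n in map(ipaddress.ip_network, cidrs)
)
starts = [lo for lo, hi in intervals]

def is_blocked(ip: str) -> bool:
    """Binary-search the sorted intervals instead of a linear scan."""
    addr = int(ipaddress.ip_address(ip))
    i = bisect.bisect_right(starts, addr) - 1
    return i >= 0 and intervals[i][0] <= addr <= intervals[i][1]

print(is_blocked("203.0.113.50"))  # True
print(is_blocked("8.8.8.8"))       # False
```

Because the ranges don’t overlap after the CIDR compression step, one lookup is O(log n) instead of O(n).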

I am going to spend my lunch hour trying to figure out ways to white-list RSS feeds so people can access them. I have had complaints.


Vietnam click farms

I am running my IP-range code against the Stop Forum Spam IP list for the last 90 days. I am learning a little bit about who is sending comment spam to blogs.

I’ve been lucky, and the spam load on my sites is down quite a bit. I have not had to deal with spam that made it past my plugins for a while, but I find there are constantly new IP ranges that need to be added to my block lists. That’s why I am buzzing the SFS list.

There are countries where nearly every IP range has at least one spammer. Since I am blocking entire CIDRs, it turns out that I would save time by blocking entire countries.

Vietnam needs to be blocked. The data shows that a great many cell phones in country code VN are engaged in clicking on comment spam. I wish that I had kept counts, but a great deal of spam originates from Vietnam.

My sites don’t see much comment spam. I think that the Stop Spammer plugin has resulted in most robots getting a 403 denied message when they hit my site so even the click farms no longer try. My big problem is hack attempts, and I have been concentrating on malicious login or SQL injection attempts because the comment spam seems to be less of a problem.

The Vietnam clicks all appear to be comment spam done by hand. I guess they get some payment for verified comments. (Ever notice that spam messages have a random string somewhere?)

As I finished the data buzz I noticed that almost the entire country of Vietnam is now blocked.

India, Bahrain, and Surinam also seem to have many of their cellphone IPs blocked.

I was going to try to white-list residential telephone networks, but that will be impossible. I have to block the entire country of Vietnam from accessing my sites.

I am also going to block the entire country of China. There seems to be no legitimate traffic at all from China; all of it is spam, along with an incredible amount of malicious attacks on my sites.


60,000 hits on wp-login

I got over 60,000 hits overnight on my CThreePO.com domain. Someone was doing a dictionary attack. I have my websites set up to block this on the first attempt. The robot got 60,000 403 “access denied” error messages but kept on chugging along through its password dictionary. Idiots!

I have to add these robots to my htaccess file, because if I got 60,000 hits to a PHP file my web host would shut me down. When I first started using this host I got lots of angry messages from my hosting company about excess CPU time. It turns out that an IP denied in the htaccess does not count as CPU time. In the beginning Yandex was hitting my sites over 100,000 times a day. I blocked Yandex, and I have been plugging leaks a little at a time ever since. Back then about 95% of the hits to my sites were robots. Now I am down to about 20%.

I have a real problem with spammers getting hold of an Amazon AWS instance and running their robots for a few hours. Amazon always catches them quickly, but there are also some good Amazon-based apps hitting my site and I don’t want to block them. I have to be very careful with automatic blocks when Amazon is involved. Amazon has to stop giving away free or cheap trials to fraudulent users.

As I block more and more IP addresses, I get fewer malicious hits on my website. This is a bad thing, because it makes it harder to test my new routines. I had no good hits on my known-exploits routines; all were blocked by the htaccess file, so I don’t know if my new modules are correctly blocking hits to exploited plugins. All morning I’ve had only one new IP address, and that was a Chinese computer making a login attempt that was caught the first time it hit my site – boring. I never thought that I would say this, but I need more spammers!


Login Forum