How to find cause of heavy usage on your Apache webserver

Here’s a quick and dirty tip for finding the cause of heavy CPU usage on your Apache web server (especially when it is running PHP scripts).

First you need to locate the Apache 2 “access.log” file. By default in Ubuntu, this file is located in the “/var/log/apache2” directory.

Then run this command from that directory to find out which IP addresses hit your website the most in a short time.
[code]
tail -n 10000 access.log | awk '{print $1}' | sort | uniq -c | sort -n
[/code]

The output lists each IP address along with the number of hits it made in the last 10,000 requests to your website:
[code]
47 117.58.252.98
81 202.124.242.186
84 202.124.245.26
182 194.164.101.217
220 208.101.22.146
225 72.167.131.144
3946 93.135.xxx.xxx
[/code]
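
On a busy site the full list gets noisy. Here is a minimal sketch of a wrapper that prints only the addresses above a hit threshold. The function name `top_ips`, its arguments and the default threshold of 1000 are my own assumptions for illustration, and it assumes the client IP is the first field of each log line, as in the command above.

```shell
#!/bin/sh
# top_ips LOGFILE [LINES] [MIN_HITS]
# Hypothetical helper: prints "hits ip" for every client that made at
# least MIN_HITS requests within the last LINES lines of LOGFILE.
top_ips() {
    logfile=$1
    lines=${2:-10000}      # how many recent log lines to inspect
    min_hits=${3:-1000}    # assumed default threshold, adjust to taste
    tail -n "$lines" "$logfile" |
        awk -v min="$min_hits" '
            { hits[$1]++ }                    # count requests per first field (client IP)
            END {
                for (ip in hits)
                    if (hits[ip] >= min) print hits[ip], ip
            }' |
        sort -n                               # busiest client ends up last
}

# Example call (path as on a default Ubuntu install):
#   top_ips /var/log/apache2/access.log 10000 1000
```

Counting in awk instead of `sort | uniq -c` gives the same totals and lets the threshold be applied in the same step.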

From here you can easily spot the offending IP address and block it from accessing your website, using an .htaccess file or another blocking method.

Here is an example that blocks an IP address from accessing your website using an .htaccess file:
[code]
order deny,allow
deny from 93.135.xxx.xxx
[/code]

Save the .htaccess file in the document root of your web server (for example /var/www), and that IP address won’t be able to access your site again.
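
If several addresses need blocking, the rules can be generated instead of typed by hand. This is only a sketch: `deny_rules` is a hypothetical helper name, it expects `hits ip` pairs in the format printed by `uniq -c`, and it emits the same `order`/`deny from` syntax used in the example above. Review the output before adding it to your .htaccess file.

```shell
#!/bin/sh
# deny_rules MIN_HITS
# Hypothetical helper: reads "hits ip" lines on stdin and prints a
# deny block covering every ip whose hit count is at least MIN_HITS.
deny_rules() {
    awk -v min="$1" '
        BEGIN     { print "order deny,allow" }   # header line, printed once
        $1 >= min { print "deny from " $2 }      # one rule per heavy hitter
    '
}

# Example call:
#   tail -n 10000 access.log | awk "{print \$1}" | sort | uniq -c | deny_rules 1000
```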

Hope that helps!

HP Officejet 5680 – How to Send Fax from Ubuntu Linux Computer

This is a follow-up to my previous post: I got myself a new and flexible printer, the HP Officejet 5680 All-in-One.

HP Officejet All-in-One Under Ubuntu
It all seemed rather easy where I left off: the printer is fully functional just by plugging it into my Ubuntu PC, the scanner works well without my having to do anything special, the phone works too (yes, it is included with the printer), and I can send and receive faxes without a hitch. On Windows Vista I could not do any of that without installing 350MB of companion applications, half of which were crapware.

Everything worked, so what is left to do?
What is left is figuring out how to send a fax directly from Ubuntu (or another Linux-based operating system) using only digital files (*.txt, *.pdf, *.ps, *.jpeg), so I don’t have to print those files and fax them one by one anymore.

HP Linux Imaging and Printing project
Through googling, I found that Hewlett-Packard (HP) has published open source software tools for their printers. Free and open source drivers and printer-specific applications directly from the manufacturer, which is very cool!

Fortunately, Ubuntu had already installed the HPLIP tools by default, along with CUPS, on my machine. The next step is to run ‘hp-setup’ as root to configure the printer port, then run the ‘hp-sendfax’ application to send faxes.

Both of these tools require the python-qt3 package, which is available from the Ubuntu software repository.

Now I can fax my PDF documents directly without having to print them first, a big saving on ink and paper costs.


Conclusion
If you are planning to get a new printer, I would suggest an HP printer. Not only are HP printers reliable, they also come with free and open source drivers and applications for Linux-based operating systems. That alone is a good reason to get one.

Please visit the HPLIP project website for more information about HP printer support on Linux-based operating systems.

[tags]hp,hewlett packard,printer,linux,opensource,ubuntu,foss,drivers,hardware,scanner,officejet[/tags]

I’m selling Webbots, Spiders, and Screen Scrapers Book

Hi there, I’d like to announce that I’m selling Webbots, Spiders, and Screen Scrapers: A Guide to Developing Internet Agents with PHP/cURL at a heavily discounted price to Malaysians. The book is new and still in its original packaging from Powells.com. There was a mix-up during ordering and I received two copies of the same title, so I’m selling one.

The book is interesting because it details how to build efficient and stealthy web robots for managing daily tasks, data mining and website monitoring. There are also 3 chapters that deal with bots designed for protocols other than the web (FTP, NNTP, email).

This book is a must-have if you’re interested in learning how to build and deploy web robots, which is still considered a black art even today.

Details

My price : RM85 (USD27), inclusive of postage
Pay to : Maybank, CIMB, PayPal
Delivery : by PosLaju
Condition : New (you can smell the ink + original package)

Please contact me if you’re interested.

RFC 2822 Email Validator in PHP

Here’s a regex pattern that checks whether an email address conforms to the RFC 2822 spec: regex.txt
This may be useful if you’re writing a robust email validation class or function in PHP that checks validity according to the spec. It also indirectly addresses security concerns about injection attempts by malicious users.
Additionally, there’s a demo with complete source code for checking email validity using eregi: Validate Email Addresses using Regular Expressions

If you are planning to validate email addresses for use in a home-made PHP mailer form, then you should read this too: Sanitize Your Forms

You might find it handy, as it guards your form against malicious users who want to manipulate it to do email injection for spam purposes.
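
Email injection generally works by smuggling a line break into a field that ends up in the mail headers, so that extra headers such as Bcc can be appended. Here is a minimal sketch of that one check, written in shell rather than PHP to keep it short. It is not taken from the linked articles, `is_single_line` is a hypothetical name, and a real form needs address validation on top of it.

```shell
#!/bin/sh
# is_single_line VALUE
# Hypothetical helper: succeeds (exit status 0) only if VALUE contains
# no carriage return or line feed, the characters used to inject headers.
is_single_line() {
    nl='
'
    cr=$(printf '\r')
    case $1 in
        *"$nl"*|*"$cr"*) return 1 ;;   # line break found: reject the value
    esac
    return 0
}

# Example call:
#   is_single_line "$from_address" || echo "rejected: line break in address"
```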
[tags]php,regex,email,validator,validate,source code,spam,injection[/tags]

OWASP PHP Filters – help sanitize PHP variables

The Internet is full of spam bots, auto-submitters, malicious users and worms that can compromise the security of your website at any time, so you should be suspicious of any data your system receives via GET/POST variables.

Among the nasty things that can happen when you don’t filter your data are SQL injection, script injection, email abuse and remote code execution. An attacker could deface your website or even wipe your entire database if you’re not careful.

One way to filter your data is to use preg_match with a regex rule describing the values you will accept for each variable.

However, I find that writing preg_match rules can be tiring, which is why I use the OWASP PHP filters to simplify the work. They consist of one function, sanitize(), that takes the variable you want to filter and an option.

The option may be any of these values: PARANOID, HTML, INT, FLOAT, LDAP, SQL, SYSTEM and UTF-8, each of which filters the data accordingly. For example, if you want your variable to contain only a floating-point number, you can code it like this:

[code]
<?php
require('sanitize.inc.php');

$var = 100.50;

// Keep only a floating-point value
$float = sanitize($var, FLOAT);
?>
[/code]

It isn’t much, but it will surely simplify your PHP coding a bit. The other options are self-explanatory, save PARANOID, which means the variable will contain only alphanumeric characters after sanitizing.
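
To make the effect of PARANOID concrete, here is a rough shell sketch of the same idea: everything that is not a letter or digit is dropped. This is not the library’s code, only an illustration of the behaviour described above, and the function name `paranoid` is made up for this example.

```shell
#!/bin/sh
# paranoid VALUE
# Hypothetical illustration: prints VALUE with every character other
# than ASCII letters and digits removed.
paranoid() {
    printf '%s' "$1" | LC_ALL=C tr -cd 'A-Za-z0-9'
}

# Example call:
#   paranoid "rm -rf /; echo 'x'"
```

A whitelist like this is blunt, but it shows why the option is called PARANOID: spaces, quotes and shell or SQL metacharacters never survive.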

SQL is handy if you want to include the variable’s value inside an SQL statement; it helps you avoid the notorious SQL injection, which would put your data at risk.

You can download the OWASP PHP filters here.

[tags]php,security,filters,mysql,sql,sql injection,injection[/tags]

Debian GNU/Linux Hands On Guide and Tutorial

After surfing for leisure last evening, I found a nice and comprehensive guide written for Debian GNU/Linux users. I guess Debian beginners and GNU/Linux newbies will find this guide, Hands-on Guide to the Debian GNU Operating System, useful.

The guide runs from an introduction to Unix, the Free Software Foundation, the GNU project and the Linux kernel through to the basics of system management, such as package management, setting up firewalls, configuring the X Window System and recompiling the default kernel.

The guide is also available for download at : http://colt.projectgamma.com/

Happy huntin’