How to save (or mirror) an entire website with httrack in Debian and Ubuntu

Httrack is a tool for copying and saving an entire website, and it is available in Debian and Ubuntu. It crawls an online website and saves each of its pages, including images and other downloadable files.

Among httrack's features are:

  • Able to continue interrupted downloads
  • Selective download
  • Customizable user-agent
  • Customizable Scan-rules, can exclude files from being crawled
  • Accept cookies
  • URL hacks
  • Tolerant requests support

Using ‘httrack‘ is easy, as it has a built-in wizard that guides you through the process of mirroring a web site. The wizard asks a series of questions: the URL to be mirrored, the location where the files will be saved, the proxy server (if any) and the user-agent to be used.
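The same answers can also be given on the command line instead of through the wizard. Here is a minimal sketch, assuming a made-up site (www.example.com) and a destination folder under your home directory. The ‘run‘ helper is my own addition and not part of httrack: it only prints the command unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Hypothetical values, replace with your own site and folder.
url="http://www.example.com/"
dest="$HOME/mirrors/example"

# Helper (not part of httrack): print the command when DRY_RUN=1,
# execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# -O sets the output folder, the "+..." argument is a scan rule that
# keeps the crawl on this domain, -F sets the user-agent, -v is verbose.
run httrack "$url" -O "$dest" "+*.example.com/*" -F "Mozilla/5.0 (compatible; mirror)" -v
```

Review the printed command first, then run the script again with DRY_RUN=0 to start the actual mirror.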

P.S. httrack is perhaps the only open-source website copier/downloader tool available for the GNU/Linux operating system. It is efficient and easy to use. My only gripe with ‘httrack‘ is that, unlike its counterpart on Microsoft Windows, it does not provide progress feedback the way ‘wget‘ does.

Linux Package Manager Cheat Sheet Reference Chart

Linux comes in many flavors, or distros, and each distro handles software installation differently. Most GNU/Linux distros use a package management system to handle software updates, installation and removal, which helps users administer their Linux systems.

However, these package management systems have different interfaces and commands. Users of Ubuntu (or other Debian-based distros) might only be familiar with ‘apt’ or dpkg, while Fedora (Red Hat based) users might only be familiar with yum and rpm. This can create confusion when users of either distro have to work in the other environment.

Luckily, somebody was kind enough to put together a Linux Package Manager Cheat Sheet, which acts as a reference point whenever a user has to switch to a distro whose package manager is unfamiliar to them.

The package management tools covered are: apt, dpkg, yum, rpm, pkg* (Slackware based) and the AIX-based lsl**.
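To give a taste of what such a cheat sheet covers, here is a small sketch that prints the install command for whichever of apt or yum is found on the machine. The function name ‘install_cmd‘ is made up for this example, and it only prints the command, it does not install anything.

```shell
#!/bin/sh
# install_cmd is a hypothetical helper name. It prints the install
# command for the package manager found on this machine.
install_cmd() {
    pkg="$1"
    if command -v apt-get >/dev/null 2>&1; then
        echo "sudo apt-get install $pkg"     # Debian, Ubuntu
    elif command -v yum >/dev/null 2>&1; then
        echo "sudo yum install $pkg"         # Fedora, Red Hat
    else
        echo "no known package manager found for $pkg"
    fi
}

install_cmd httrack
```

The same pattern could be extended to the other tools on the cheat sheet, one branch per package manager.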

[ Source ]

How to optimize MySQL tables automatically using cron

Busy websites with a lot of insert/delete transactions may introduce fragmentation in their MySQL tables. Fortunately, users can optimize MySQL tables with the ‘OPTIMIZE TABLE’ command, but how do you execute it automatically?

Here’s how:
The mysql-client package in Ubuntu comes with a tool called mysqlcheck, which is handy for optimizing tables in MySQL. It can be run from bash, and therefore it can be scheduled with cron.

To do that, just run these commands:

[bash]
crontab -e

# In the crontab file, add this line (no space between -p and the password)
59 23 * * * /usr/bin/mysqlcheck -o -v -u <mysql username> -p<password> -h localhost <database_name>
[/bash]

This tells cron to execute mysqlcheck and optimize the tables of the specified database at exactly 11:59 pm every day. You can change the schedule to suit your needs.
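A password on the command line is visible to other users in the process list. One alternative is to keep the login in a MySQL option file, which mysqlcheck reads from the [client] group of ~/.my.cnf. The sketch below is only an illustration: ‘write_client_cnf‘, the user "dbuser" and the password "secret" are made up, and the demo writes to a scratch file instead of your real ~/.my.cnf.

```shell
#!/bin/sh
# write_client_cnf is a hypothetical helper: it writes a MySQL option
# file holding the login, readable only by its owner.
# The body runs in a subshell so the umask change stays local.
write_client_cnf() (
    umask 077
    printf '[client]\nuser=%s\npassword=%s\nhost=localhost\n' "$2" "$3" > "$1"
    chmod 600 "$1"
)

# Demo on a scratch file; on a real system the target would be ~/.my.cnf
demo_cnf="${TMPDIR:-/tmp}/my.cnf.demo.$$"
write_client_cnf "$demo_cnf" "dbuser" "secret"
cat "$demo_cnf"

# The crontab line then needs no login options on the command line:
# 59 23 * * * /usr/bin/mysqlcheck -o -v <database_name>
```

This assumes the cron job runs as the same user that owns the option file.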

How to change hostname in Ubuntu server

Here’s how you can change the hostname in Ubuntu server:

1. Edit /etc/hostname, and change the hostname
2. Edit the /etc/hosts file, and add the new hostname to the 127.0.0.1 entry, or to any local machine IP
3. Run “sudo service hostname stop”, and then “sudo service hostname start”
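The two file edits in steps 1 and 2 can be sketched as a small shell function. This is only an illustration: ‘set_hostname_files‘ and the hostname "web01" are made up, and the demo works on scratch copies so nothing on your system is touched.

```shell
#!/bin/sh
# set_hostname_files is a hypothetical helper. It takes the new name and
# the two files to edit, so it can be tried on copies first.
set_hostname_files() {
    new="$1"; hostname_file="$2"; hosts_file="$3"
    # Step 1: the hostname file holds just the name
    echo "$new" > "$hostname_file"
    # Step 2: map the name to 127.0.0.1 unless a line already does
    # (a simple substring check, good enough for a sketch)
    if ! grep -q "^127\.0\.0\.1[[:space:]].*$new" "$hosts_file"; then
        echo "127.0.0.1 $new" >> "$hosts_file"
    fi
}

# Try it on scratch copies
work="${TMPDIR:-/tmp}/hostname-demo.$$"
mkdir -p "$work"
echo "oldname" > "$work/hostname"
echo "127.0.0.1 localhost" > "$work/hosts"
set_hostname_files "web01" "$work/hostname" "$work/hosts"
cat "$work/hostname" "$work/hosts"
```

On a real server you would run it as root against /etc/hostname and /etc/hosts, and then carry out step 3.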

How to Hide Apache2 and PHP version without using mod_security in Ubuntu Linux

Although security by obscurity is not the best policy for protecting your IS assets, it is still useful for thwarting simple network scanners and newbie crackers.

Note: This tip is written for Ubuntu Linux. The steps are similar on other GNU/Linux distros, albeit with slight variations.

Hiding Apache2 version
Edit /etc/apache2/apache2.conf

Add these lines at the end of the file:
ServerSignature Off
ServerTokens Prod

Restart Apache2
[bash]
sudo /etc/init.d/apache2 restart
[/bash]
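To check the result, look at the Server header that Apache sends back. Here is a small sketch: ‘server_header_ok‘ is a made-up helper that reads response headers on standard input and succeeds when the Server header carries no digits, i.e. no version number. The sample headers are invented for the demo.

```shell
#!/bin/sh
# server_header_ok is a hypothetical helper: it reads HTTP response
# headers on stdin and succeeds when the Server header has no digits.
server_header_ok() {
    ! grep -i '^Server:' | grep -q '[0-9]'
}

# Sample headers, before and after the change
printf 'HTTP/1.1 200 OK\r\nServer: Apache/2.2.8 (Ubuntu)\r\n' | server_header_ok || echo "version exposed"
printf 'HTTP/1.1 200 OK\r\nServer: Apache\r\n' | server_header_ok && echo "version hidden"

# Against a live server (assumes curl is installed):
# curl -sI http://localhost/ | server_header_ok && echo "version hidden"
```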

Hiding PHP version
Edit /etc/php5/apache2/php.ini file

Find these lines, and set both of them to Off:
expose_php = Off
display_errors = Off

Additionally you may disable certain ‘risky’ functions in php by editing the disable_functions line:
disable_functions = phpinfo,system,show_source

Finally, you may restart Apache2 web server.
[bash]
sudo /etc/init.d/apache2 restart
[/bash]
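If you have several servers to change, the php.ini edit can be scripted. The sketch below is only an illustration: ‘set_ini_off‘ is a made-up helper, and the demo works on a scratch file with invented contents, not on your real php.ini. Commented-out lines (those starting with a semicolon) are left alone.

```shell
#!/bin/sh
# set_ini_off is a hypothetical helper: it rewrites "name = ..." to
# "name = Off" in the given ini file, going through a temporary file.
set_ini_off() {
    name="$1"; ini="$2"
    sed "s/^[[:space:]]*$name[[:space:]]*=.*/$name = Off/" "$ini" > "$ini.tmp" \
        && mv "$ini.tmp" "$ini"
}

# Demo on a scratch file
ini="${TMPDIR:-/tmp}/php.ini.demo.$$"
printf 'expose_php = On\ndisplay_errors = On\nmemory_limit = 128M\n' > "$ini"
set_ini_off expose_php "$ini"
set_ini_off display_errors "$ini"
cat "$ini"
```

Keep a backup of the real file before pointing the function at it, and restart Apache2 afterwards.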