Tuesday, April 24, 2007

The Computational Chemistry Cluster of IIITMK

The Computational Chemistry Portal project at IIITM-K helps people work in frontier areas of chemistry through interaction with world-class experts and web-accessed resources, with the goal of building quality education in chemistry interactively, from the basics up to graduate, postgraduate, and research levels. The specific aim of this portal is to enable computation in chemistry through grid access to open-source software and high-end computers spread across universities and institutions. The use of the web-enabled software WebMO is a good step ahead in this endeavour.



This project aims at making use of the computational infrastructure of IIITM-K, especially its high-end Linux and Solaris servers. Since IIITM-K has a 24x7 data center facility and high bandwidth availability, it is a good choice for hosting the portal. Moreover, the project is strongly based on service-oriented principles and ready availability.

Figure 1: A typical HPC Stack




IIITM-K has a coherent server farm comprising seven high-end servers and a CD-stack server with a total capacity approaching a terabyte. It provides a wide variety of intranet and web-space services. The server farm is accessible from anywhere, facilitated by policy-based access mechanisms. Most of the services are accessible through 'My Desktop', the .NET Enterprise Servers, and the Linux servers.



Let me chalk out the computational resources employed in this portal under IIITM-K. The Computational Chemistry Portal and its associated services, such as the blog and the analytical chemistry portal, are hosted on high-speed Sun Solaris servers, which form the web tier. The backend of the portal is a set of High-Performance Computing (HPC) cluster machines that provide the aggregate computational speed of 5 individual machines.



Apart from this, there are high-performance machines for separate computations, operated through the console or a shell. Many open-source visualization and analysis tools are also part of the project and can be easily downloaded from the portal. The portal conducts workshops, hands-on labs, and similar events from time to time, and undertakes research projects, training, and related activities side by side.



Computational Resources: Statistics



  • Solaris web server (SunFire V120) with a StorEdge storage server (terabyte-scale) in a cluster

  • Computation server (2.4 GHz, 2 GB RAM) for shell-based computations

  • Computational cluster (HPC) with 5 nodes, each with a 2 GHz processor and 256 MB RAM, fully operational with WebMO for web-enabled computations

  • Independent workstations (2.4 GHz, Windows preinstalled) for users

  • Computational packages such as GAMESS, NWChem, WebMO (licensed), QMView, Molden, Ghemical, GROMACS, dgbas, and many other open-source and free programs for structure computations, molecular dynamics computations, and visualization.



Overview of the Cluster

Figure 2: Cluster Schematic Diagram



We run a Beowulf cluster for computational chemistry. It has 5 nodes in total: a head node and 4 compute nodes, across which the work is partitioned. Parallel computational chemistry jobs are submitted to the machine, and the cluster runs these computations about 30 times faster than a single machine with greater RAM and processor speed. The compute side consists of 4 machines, each with a dual-core 2.93 GHz Intel processor and 512 MB of RAM. The NFS mount, acting as swap, has greatly reduced RAM consumption, and it is available to all 4 parallel-connected systems. We are also planning to integrate the Sun Grid Engine into the cluster to provide grid services.



Cluster software



The following software packages are installed:

1. GAMESS (parallel)

2. NWChem (parallel)

3. Tinker (sequential)

4. WebMO (parallel), installed on the head node
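
As an illustration, a parallel GAMESS job is typically launched through the rungms script that ships with GAMESS (a sketch only; 'myjob' stands for a hypothetical input file myjob.inp, '00' is the default version string, and 4 is the processor count, all of which depend on the local install):

# rungms myjob 00 4 >& myjob.log &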



The applications are installed in /export/apps/

(Only root has access to modify the apps directories and files)

The cluster can be monitored locally through the URL http://192.168.1.12/ganglia.



WebMO is yet to be programmed and configured for automatic host detection; currently, GAMESS and NWChem are hard-coded for parallel execution. The cluster front-end webpage can be accessed locally at http://192.168.1.12.



Cluster Configuration



cluster.iiitmk.ac.in (head node)

Compute nodes:

Compute-0-0.local

Compute-0-1.local

Compute-0-2.local

Compute-pvfs-0



Compulsory services on nodes:

nfs, sshd, postfix, gmond, nfslock, network, gmetad, iptables, mysqld, httpd
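
These can be verified on a node with the standard Red Hat-style tools (a sketch; the service and chkconfig commands are assumed to be present, as on the CentOS base that Rocks-style clusters use):

# service gmond status
# chkconfig gmond on

Repeat for each service in the list above, or push the check to every node at once with cluster-fork, shown further below.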





Partitioning scheme



Mount point            Label                Size
/                      /                    5.8 GB (/dev/hda1)
/state/partition1      /state/partition1    67 GB
swap                   (none)               1 GB



'/state/partition1' is mounted on '/export' on the head node, and /export is exported to the other nodes via NFS. Whenever you create a user, the home directory is created under /export/home/username; when the user logs in, it is mounted as /home/username.
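
In practice, adding a user looks like this (a sketch; 'alice' is a hypothetical user name, and the mounting behaviour is as described above):

# useradd alice
# ls /export/home/alice
(after alice logs in on any node, the same directory appears as /home/alice)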



On the compute nodes, cluster.local:/export/home/username is mounted as /home/username (for example, /home/chemistry for the 'chemistry' user); this is available on each node. Custom-compiled applications are placed in '/export/apps' on the head node and exported as /share/apps on the compute nodes.



deMon, dgbas, fftw, g03, gamess, gromacs, gv, NWChem, tinker



A directory called 'scr' is created on /state/partition1, and its ownership is changed to the user 'chemistry'. This scratch directory is not shared (only the head node partition, /state/partition1, is shared).

You can execute the same command on all nodes by issuing it just once on the head node, like this:



# cluster-fork "command name"



e.g.: # cluster-fork "df -h"
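
For instance, the 'scr' scratch directory described above can be created on every compute node in one shot (reusing the paths and the 'chemistry' user from this setup):

# cluster-fork "mkdir -p /state/partition1/scr"
# cluster-fork "chown chemistry /state/partition1/scr"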



(Next - Backup Scheme of Computational Clusters)

Friday, April 20, 2007

The Google Architecture

I have been digging through various aspects of Google's technology for the past month, and I thought it would be good to invite you to peek into the unique technology underlying the Internet search engine giant. Unlike common high-traffic server architectures, the Google cluster architecture is based on strong software and lots of PCs; some say more than 15,000 PCs take part in the Google phenomenon.

I am referring to the Quazen Web article, which is a fine account of the technology. Let's take a look at the innards of this wonderful search engine.

Google's architecture provides reliability across its environment of servers and PCs at the software level, by replicating services across many different machines. Google is also proud of its own failure-detection mechanism, which handles threats and malfunctions across its web.

The mechanism: when a user enters a query into Google, the user's browser first performs a domain name system (DNS) lookup to map www.google.com to a particular IP address. To provide sufficient capacity to handle query traffic, the Google service is spread across multiple clusters distributed worldwide.

Each cluster has a few thousand machines, and the geographically distributed setup protects Google against disasters at the data centers (such as those arising from earthquakes or large-scale power failures).

A DNS-based load-balancing system selects a physical cluster according to the user's geographic location, minimizing the round trips for the user's request.
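
You can watch this DNS-based selection from your own machine; the IP address returned for www.google.com differs depending on where you ask from:

nslookup www.google.com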

The user's browser then sends an HTTP request to one of these clusters, and thereafter the processing of that query is entirely local to that cluster. A hardware-based load balancer in each cluster monitors the available set of Google Web servers (GWSs) and performs local load balancing of requests across them. After receiving a query, a GWS machine coordinates the query execution and formats the results into an HTML response for the user's browser. Query execution consists of two major phases.

In the first phase, the index servers consult an inverted index that maps each query word to a matching list of documents (the hit list). The index servers then determine a set of relevant documents by intersecting the hit lists of the individual query words, and they compute a relevance score for each document.

This relevance score determines the order of results on the output page. The search process is challenging because of the large amount of data: the raw documents comprise several tens of terabytes of uncompressed data, and the inverted index resulting from this raw data is itself many terabytes of data. Fortunately, the search is highly parallelizable by dividing the index into pieces (index shards), each holding a randomly chosen subset of documents from the full index. A pool of machines serves requests for each shard, and the overall index cluster contains one pool per shard. Each request chooses a machine within a pool using an intermediate load balancer; in other words, each query goes to one machine (or a subset of machines) assigned to each shard. If a shard's replica goes down, the load balancer will avoid using it for queries, and other components of the cluster management system will try to revive it or eventually replace it with another machine.

During the downtime, the system capacity is reduced in proportion to the total fraction of capacity that this machine represented. However, service remains uninterrupted, and all parts of the index remain available. The final result of this first phase of query execution is an ordered list of document identifiers (docids). The second phase involves taking this list of docids and computing the actual title and uniform resource locator of these documents, along with a query-specific document summary.

Document servers (docservers) handle this job, fetching each document from disk to extract the title and the keyword-in-context snippet. As with the index lookup phase, the strategy is to partition the processing of all documents by randomly distributing documents into smaller shards, having multiple server replicas responsible for handling each shard, and routing requests through a load balancer.
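
The hit-list intersection at the core of the first phase is easy to picture with ordinary shell tools (a toy sketch; the files are hypothetical lists of docids, one per line):

sort -o hits_word1.txt hits_word1.txt
sort -o hits_word2.txt hits_word2.txt
comm -12 hits_word1.txt hits_word2.txt

The last command prints only the docids present in both sorted lists, i.e. the documents matching both query words; the index servers do the same job at much larger scale across the shards.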



It would be worthwhile at this point to refer to some of the research papers on different aspects of this technology:

Luiz Barroso, Jeffrey Dean, and Urs Hoelzle: "Web Search for a Planet: The Google Cluster Architecture"

Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber: "Bigtable: A Distributed Storage System for Structured Data"

Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung: "The Google File System"

Mike Burrows: "The Chubby Lock Service for Loosely-Coupled Distributed Systems"

Chapter from the book 'The Google Legacy' (PDF)



Reference: the Quazen Web article on 'Google Cluster Technology', and Google Labs (http://labs.google.com) for research papers on Google technology

Sunday, April 15, 2007

Problem of hosts blocked in MySQL because of 'max_connect_errors' variable

Problem : "Host 'host_name' is blocked" error in mysql version 4.0 and later



This situation is encountered in remote database connections to a typical MySQL server (version 4.0 or later). MySQL can be installed on either a Linux or a Windows platform, but such an error is thrown when a user tries to connect to that machine over the network using SQLyog. A few other factors you should check are given first.



Checklist:



  1. Make sure that a firewall is not blocking traffic on either the client or the server machine; this can block connection attempts.
  2. Make sure that no more than the allowed number of clients are trying to connect to the server at any given time.


If you get a 'Too many connections' error when you try to connect to the mysqld server, this means that all available connections are in use by other clients.



The number of connections allowed is controlled by the max_connections system variable. Its default value is 100. If you need to support more connections, you should restart mysqld with a larger value for this variable.
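
On MySQL 4.0.3 and later, you can also inspect the limit and raise it at runtime without a restart (the value 500 below is only an example):

mysql> SHOW VARIABLES LIKE 'max_connections';
mysql> SET GLOBAL max_connections = 500;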


mysqld actually allows max_connections+1 clients to connect. The extra connection is reserved for use by accounts that have the SUPER privilege. By granting the SUPER privilege to administrators and not to normal users (who should not need it), an administrator can connect to the server and use SHOW PROCESSLIST to diagnose problems even if the maximum number of unprivileged clients are connected. See the reference on the MySQL site.
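
For example, having connected through the reserved slot, the administrator can inspect the open connections and, if necessary, terminate a runaway one (the thread id 27 is hypothetical, read off the process list):

mysql> SHOW PROCESSLIST;
mysql> KILL 27;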

The maximum number of connections MySQL can support depends on the quality of the thread library on a given platform. Linux or Solaris should be able to support 500-1000 simultaneous connections, depending on how much RAM you have and what your clients are doing. Static Linux binaries provided by MySQL AB can support up to 4000 connections.

Now we come to the error message at the heart of the problem.

Error: Host 'host_name' is blocked.

If you get the following error, it means that mysqld has received many connect requests from the host 'host_name' that have been interrupted in the middle:

The actual error message looks like this:

Host 'host_name' is blocked because of many connection errors. Unblock with 'mysqladmin flush-hosts'

The number of interrupted connect requests allowed is determined by the value of the max_connect_errors system variable. After max_connect_errors failed requests, mysqld assumes that something is wrong (for example, that someone is trying to break in), and blocks the host from further connections until you execute a mysqladmin flush-hosts command or issue a FLUSH HOSTS statement.

By default, mysqld blocks a host after 10 connection errors. You can adjust the value by starting the server like this:
shell> mysqld_safe --max_connect_errors=10000 & 
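
To make the change permanent across server restarts, set the variable in the option file instead (a sketch of the relevant my.cnf section):

[mysqld]
max_connect_errors = 10000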

If you get this error message for a given host, you should first verify that there isn't anything wrong with TCP/IP connections from that host. If you are having network problems, it does you no good to increase the value of the 'max_connect_errors' variable.
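
A quick sanity check of the network path from the affected client ('db_host' stands in for your server's hostname; 3306 is MySQL's default port):

shell> ping db_host
shell> telnet db_host 3306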

It is also handy to issue a FLUSH HOSTS command. Refer to the following link for the syntax:

http://dev.mysql.com/doc/refman/5.0/en/flush.html
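
Either of the following clears the blocked-host cache (both require the RELOAD privilege):

shell> mysqladmin -u root -p flush-hosts
mysql> FLUSH HOSTS;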

After these steps, reconnect using SQLyog. Wishing you good days.

Source Documentation: MySQL Developer Zone

Wednesday, April 04, 2007

Scribefire - Fire up your blogging

ScribeFire - Preview



ScribeFire, earlier called 'Performancing', is a blog-editing tool that integrates with your Firefox browser and lets you easily post to your blog. It opens a split panel that provides a full set of text-editing tools, complete with rich/source editing tabs, a live preview, a post history, and more. You can easily drag and drop text, images, and links into the ScribeFire pane, making it super-easy to reference other sites and information within your blog entries.

The latest version, 1.4, promises improved support for Blogger accounts and better file uploading. It also supports the new WordPress API. You can hook it up to your personal blog and fire off your words whenever you find it convenient!

In addition to Blogger and WordPress, ScribeFire works with Jeeran, LiveJournal, TypePad, and Windows Live Spaces. This killer extension costs nothing; it requires Firefox 1.5 or later.
Powered by ScribeFire.