
Saturday, February 16, 2008

IBM System Cluster 1350

Reduced time to deployment. IBM HPC clustering offers significant price/performance advantages for many high-performance workloads by harnessing the advantages of low-cost servers plus innovative, easily available open source software.

Today, some businesses are building their own Linux and Microsoft clusters using commodity hardware, standard interconnects and networking technology, open source software, and in-house or third-party applications. Despite the apparent cost advantages offered by these systems, the expense and complexity of assembling, integrating, testing and managing these clusters from disparate, piece-part components often outweigh any benefits gained.

IBM has designed the IBM System Cluster 1350 to help address these challenges. Now clients can benefit from IBM's extensive experience with HPC to help minimize this complexity and risk. Using advanced Intel® Xeon®, AMD Opteron™, and IBM PowerPC® processor-based server nodes, proven cluster management software and optional high-speed interconnects, the Cluster 1350 offers the best of IBM and third-party technology. As a result, clients can speed up installation of an HPC cluster, simplify its management, and reduce mean time to payback.

The Cluster 1350 is designed to be an ideal solution for a broad range of application environments, including industrial design and manufacturing, financial services, life sciences, government and education. These environments typically require excellent price/performance for handling high performance computing (HPC) and business performance computing (BPC) workloads. It is also an excellent choice for applications that require horizontal scaling capabilities, such as Web serving and collaboration.
Common features
Hardware summary
Rack-optimized Intel Xeon dual-core and quad-core and AMD Opteron processor-based servers
Intel Xeon, AMD and PowerPC processor-based blades
Optional high capacity IBM System Storage™ DS3200, DS3400, DS4700, DS4800 and EXP3000 Storage Servers and IBM System Storage EXP 810 Storage Expansion
Industry-standard Gigabit Ethernet cluster interconnect
Optional high-performance Myrinet-2000 and Myricom 10g cluster interconnect
Optional Cisco, Voltaire, Force10 and PathScale InfiniBand cluster interconnects
Clearspeed Floating Point Accelerator
Terminal server and KVM switch
Space-saving flat panel monitor and keyboard
Runs with RHEL 4 or SLES 10 Linux operating systems or Windows Compute Cluster Server
Robust cluster systems management and scalable parallel file system software
Hardware installed and integrated in 25U or 42U Enterprise racks
Scales up to 1,024 cluster nodes (larger systems and additional configurations available—contact your IBM representative or IBM Business Partner)
Optional Linux cluster installation and support services from IBM Global Services or an authorized partner or distributor
Clients must obtain the version of the Linux operating system specified by IBM from IBM, the Linux Distributor or an authorized reseller
x3650—dual core up to 3.0 GHz, quad core up to 2.66 GHz
x3550—dual core up to 3.0 GHz, quad core up to 2.66 GHz
x3455—dual core up to 2.8 GHz
x3655—dual core up to 2.6 GHz
x3755—dual core up to 2.8 GHz
HS21—dual core up to 3.0 GHz, quad core up to 2.66 GHz
HS21 XM—dual core up to 3.0 GHz, quad core up to 2.33 GHz
JS21—2.7/2.6 GHz*; 2.5/2.3 GHz*
LS21—dual core up to 2.6 GHz
LS41—dual core up to 2.6 GHz
QS20—multi-core 3.2 GHz

IBM System Cluster 1600

IBM System Cluster 1600 systems consist of IBM POWER5™ and POWER5+™ symmetric multiprocessing (SMP) servers running AIX 5L™ or Linux®. Cluster 1600 is a highly scalable cluster solution for large-scale computational modeling and analysis, large databases and business intelligence applications, and cost-effective datacenter, server and workload consolidation. Cluster 1600 systems can be deployed on Ethernet networks, InfiniBand networks, or with the IBM High Performance Switch, and are typically managed with Cluster Systems Management (CSM) software, a comprehensive tool designed specifically to streamline initial deployment and ongoing management of cluster systems.
Common features
· Highly scalable AIX 5L or Linux cluster solutions for large-scale computational modeling, large databases and cost-effective data center, server and workload consolidation
· Cluster Systems Management (CSM) software for comprehensive, flexible deployment and ongoing management
· Cluster interconnect options: industry standard 1/10Gb Ethernet (AIX 5L or Linux); IBM High Performance Switch (AIX 5L and CSM); SP Switch2 (AIX 5L and PSSP); 4x/12x InfiniBand (AIX 5L or SLES 9); or Myrinet (Linux)
· Operating system options: AIX 5L Version 5.2 or 5.3, SUSE Linux Enterprise Server 8 or 9, Red Hat Enterprise Linux 4
· Complete software suite for creating, tuning and running parallel applications: Engineering & Scientific Subroutine Library (ESSL), Parallel ESSL, Parallel Environment, XL Fortran, VisualAge C++
· High-performance, high availability, highly scalable cluster file system: General Parallel File System (GPFS)
· Job scheduling software to optimize resource utilization and throughput: LoadLeveler®
· High availability software for continuous access to data and applications: High Availability Cluster Multiprocessing (HACMP™)
Hardware summary
· Mix and match IBM POWER5 and POWER5+ servers:
· IBM System p5™ 595, 590, 575, 570, 560Q, 550Q, 550, 520Q, 520, 510Q, 510, 505Q and 505
· IBM eServer™ p5 595, 590, 575, 570, 550, 520, and 510
· Up to 128 servers or LPARs (AIX 5L or Linux operating system images) per cluster depending on hardware; higher scalability by special order

IBM System p 570 with POWER 6


* Advanced IBM POWER6™ processor cores for enhanced performance and reliability
* Building block architecture delivers flexible scalability and modular growth
* Advanced virtualization features facilitate highly efficient systems utilization
* Enhanced RAS features enable improved application availability

The IBM POWER6 processor-based System p™ 570 mid-range server delivers outstanding price/performance, mainframe-inspired reliability and availability features, flexible capacity upgrades and innovative virtualization technologies. This powerful 19-inch rack-mount system, which can handle up to 16 POWER6 cores, can be used for database and application serving, as well as server consolidation. The modular p570 is designed to continue the tradition of its predecessor, the IBM POWER5+™ processor-based System p5™ 570 server, for resource optimization, secure and dependable performance and the flexibility to change with business needs. Clients have the ability to upgrade their current p5-570 servers and know that their investment in IBM Power Architecture™ technology has again been rewarded.

The p570 is the first server designed with POWER6 processors, resulting in performance and price/performance advantages while ushering in a new era in the virtualization and availability of UNIX® and Linux® data centers. POWER6 processors can run 64-bit applications, while concurrently supporting 32-bit applications to enhance flexibility. They feature simultaneous multithreading,1 allowing two application "threads" to be run at the same time, which can significantly reduce the time to complete tasks.

The p570 system is more than an evolution of technology wrapped into a familiar package; it is the result of "thinking outside the box." IBM's modular symmetric multiprocessor (SMP) architecture means that the system is constructed using 4-core building blocks. This design allows clients to start with what they need and grow by adding additional building blocks, all without disruption to the base system.2 Optional Capacity on Demand features allow the activation of dormant processor power for times as short as one minute. Clients may start small and grow with systems designed for continuous application availability.

Specifically, the System p 570 server provides:

Common features
* 19-inch rack-mount packaging
* 2- to 16-core SMP design with building block architecture
* 64-bit 3.5, 4.2 or 4.7 GHz POWER6 processor cores
* Mainframe-inspired RAS features
* Dynamic LPAR support
* Advanced POWER Virtualization1 (option)
  o IBM Micro-Partitioning™ (up to 160 micro-partitions)
  o Shared processor pool
  o Virtual I/O Server
  o Partition Mobility2
* Up to 32 optional I/O drawers
* IBM HACMP™ software support for near continuous operation*
* Supported by AIX 5L (V5.2 or later) and Linux® distributions from Red Hat (RHEL 4 Update 5 or later) and SUSE Linux (SLES 10 SP1 or later) operating systems

Hardware summary
* 4U 19-inch rack-mount packaging
* One to four building blocks
* Two, four, eight, 12 or 16 3.5 GHz, 4.2 GHz or 4.7 GHz 64-bit POWER6 processor cores
* L2 cache: 8 MB to 64 MB (2- to 16-core)
* L3 cache: 32 MB to 256 MB (2- to 16-core)
* 2 GB to 192 GB of 667 MHz buffered DDR2, or 16 GB to 384 GB of 533 MHz buffered DDR2, or 32 GB to 768 GB of 400 MHz buffered DDR2 memory3
* Four hot-plug, blind-swap PCI Express 8x and two hot-plug, blind-swap PCI-X DDR adapter slots per building block
* Six hot-swappable SAS disk bays per building block provide up to 7.2 TB of internal disk storage
* Optional I/O drawers may add up to an additional 188 PCI-X slots and up to 240 disk bays (72 TB additional)4
* One SAS disk controller per building block (internal)
* One integrated dual-port Gigabit Ethernet per building block standard; one quad-port Gigabit Ethernet per building block available as optional upgrade; one dual-port 10 Gigabit Ethernet per building block available as optional upgrade
* Two GX I/O expansion adapter slots
* One dual-port USB per building block
* Two HMC ports (maximum of two), two SPCN ports per building block
* One optional hot-plug media bay per building block
* Redundant service processor for multiple building block systems2

Mirror Write Consistency


Mirror Write Consistency (MWC) ensures data consistency on logical volumes in case a system crash occurs during mirrored writes. The active method achieves this by logging when a write occurs. LVM makes an update to the MWC log that identifies what areas of the disk are being updated before performing the write of the data. Records of the last 62 distinct logical track groups (LTGs) written to disk are kept in memory and also written to a separate checkpoint area on disk (the MWC log). This results in a performance degradation during random writes.

With AIX V5.1 and later, there are two ways of handling MWC:
• Active, the existing method
• Passive, the new method
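As a quick illustration (a minimal sketch; the logical volume name datalv is hypothetical, and passive MWC generally requires a big volume group), the MWC policy of a mirrored logical volume can be inspected and switched with lslv and chlv:

# Show the current Mirror Write Consistency setting of a logical volume
lslv datalv | grep -i "write consistency"

# Switch MWC to the passive method (AIX V5.1 and later)
chlv -w p datalv

# Switch back to the active (logging) method
chlv -w a datalv

# Turn MWC off entirely (only if the application guarantees its own consistency)
chlv -w n datalv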

WSM Objectives

The objectives of the Web-based System Manager are:
• Simplify AIX administration through a single interface
• Enable AIX systems to be administered from almost any client platform with a browser that supports Java 1.3, or with client code downloaded from an AIX V5.3 host
• Enable AIX systems to be administered remotely
• Provide a system administration environment with a look and feel similar to the Windows NT/2000/XP, Linux and AIX CDE environments

The Web-based System Manager provides a comprehensive system management environment and covers most of the tasks in the SMIT user interface. The Web-based System Manager can only be run from a graphics terminal, so SMIT still needs to be used in the ASCII environment.

To download the Web-based System Manager client code from an AIX host, use the address http:///remote_client.html.

Supported Microsoft Windows clients for AIX 5.3 are Windows 2000 Professional, Windows XP Professional and Windows Server 2003. Supported Linux clients are PCs running Red Hat Enterprise Linux 3, SLES 8, SLES 9, SuSE 8.0, SuSE 8.1, SuSE 8.2 and SuSE 9.0, using the KDE or GNOME desktops only.

The PC Web-based System Manager client installation needs a minimum of 300 MB of free disk space, 512 MB of memory (1 GB preferred) and a 1 GHz CPU.

Directories to monitor in AIX


/var/adm/sulog - switch user (su) log file (ASCII). Use cat, pg or more to view it and rm to clean it out.
/etc/security/failedlogin - failed logins from users. Use the who command to view it. Use "cat /dev/null > /etc/security/failedlogin" to empty it.
/var/adm/wtmp - all login accounting activity. Use the who command to view it; use "cat /dev/null > /var/adm/wtmp" to empty it.
/etc/utmp - who is currently logged in to the system. Use the who command to view it. Use "cat /dev/null > /etc/utmp" to empty it.
/var/spool/lpd/qdir/* - leftover queue requests
/var/spool/qdaemon/* - temporary copies of spooled files
/var/spool/* - spooling directory
smit.log - SMIT log file of activity
smit.script - SMIT script log
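A small housekeeping sketch along these lines can be scheduled weekly (the 10 MB threshold and the choice of files to truncate are my own assumptions; review before using, since it empties accounting files):

#!/bin/ksh
# Report the size of the usual AIX growth files, then truncate the
# login accounting files if they exceed roughly 10 MB.
for f in /var/adm/sulog /etc/security/failedlogin /var/adm/wtmp /etc/utmp
do
    [ -f "$f" ] || continue
    size=$(du -k "$f" | awk '{print $1}')
    echo "$f is ${size} KB"
    if [ "$size" -gt 10240 ]; then
        case "$f" in
        /etc/security/failedlogin|/var/adm/wtmp)
            cat /dev/null > "$f"    # empty the file, keep ownership and permissions
            echo "  truncated $f"
            ;;
        esac
    fi
done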

What is Hot Spare

What is an LVM hot spare?
A hot spare is a disk or group of disks used to replace a failing disk. LVM marks a physical volume missing due to write failures. It then starts the migration of data to the hot spare disk.

Minimum hot spare requirements
The following is a list of minimal hot sparing requirements enforced by the operating system:
- Spares are allocated and used by volume group.
- Logical volumes must be mirrored.
- All logical partitions on hot spare disks must be unallocated.
- Hot spare disks must have at least equal capacity to the smallest disk already in the volume group. Good practice dictates having enough hot spares to cover your largest mirrored disk.

Hot spare policy
The chpv and the chvg commands are enhanced with a new -h argument. This allows you to designate disks as hot spares in a volume group and to specify a policy to be used in the case of failing disks. The following four values are valid for the hot spare policy argument (-h):
- y (lower case): Automatically migrates partitions from one failing disk to one spare disk. From the pool of hot spare disks, the smallest one which is big enough to substitute for the failing disk will be used.
- Y (upper case): Automatically migrates partitions from a failing disk, but might use the complete pool of hot spare disks.
- n: No automatic migration will take place. This is the default value for a volume group.
- r: Removes all disks from the pool of hot spare disks for this volume group.

Synchronization policy
There is a new -s argument for the chvg command that is used to specify synchronization characteristics. The following two values are valid for the synchronization argument (-s): y (automatically attempts to synchronize stale partitions) and n (does not automatically synchronize stale partitions; this is the default).

Examples
The following command marks hdisk1 as a hot spare disk:
# chpv -hy hdisk1
The following command sets an automatic migration policy which uses the smallest hot spare that is large enough to replace the failing disk, and automatically tries to synchronize stale partitions:
# chvg -hy -sy testvg
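As a quick check (the volume group name testvg is just an example), the resulting policies can be confirmed from the lsvg output, which on recent AIX levels reports HOT SPARE and AUTO SYNC fields:

# Designate a spare and set the policies, then confirm them on the volume group
chpv -hy hdisk1
chvg -hy -sy testvg
lsvg testvg | egrep -i "hot spare|auto sync"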

AIX Control Book Creation

List the licensed program products: lslpp -L
List the defined devices: lsdev -C -H
List the disk drives on the system: lsdev -Cc disk
List the memory on the system (MCA): lsdev -Cc memory
List the memory on the system (PCI): lsattr -El sys0 -a realmem; lsattr -El mem0
List system resources: lsattr -EHl sys0
List the VPD (Vital Product Data): lscfg -v
Document the tty setup: lscfg or smit screen capture (F8)
Document the print queues: qchk -A
Document disk Physical Volumes (PVs): lspv
Document Logical Volumes (LVs): lslv
Document Volume Groups (long list): lsvg -l vgname
Document Physical Volumes (long list): lspv -l pvname
Document File Systems: lsfs fsname; /etc/filesystems
Document disk allocation: df
Document mounted file systems: mount
Document paging space (70 - 30 rule): lsps -a
Document paging space activation: /etc/swapspaces
Document users on the system: /etc/passwd; lsuser -a id home ALL
Document users attributes: /etc/security/user
Document users limits: /etc/security/limits
Document users environments: /etc/security/environ
Document login settings (login herald): /etc/security/login.cfg
Document valid group attributes: /etc/group; lsgroup ALL
Document system wide profile: /etc/profile
Document system wide environment: /etc/environment
Document cron jobs: /var/spool/cron/crontabs/*
Document skulker changes if used: /usr/sbin/skulker
Document system startup file: /etc/inittab
Document the hostnames: /etc/hosts
Document network printing: /etc/hosts.lpd
Document remote login host authority: /etc/hosts.equiv
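A rough capture script in this spirit (a sketch, not a complete control book; the output location and the subset of commands included are my own choices) can dump most of these listings into a single dated file:

#!/bin/ksh
# Hypothetical "control book" snapshot script
OUT=/tmp/controlbook.$(hostname).$(date +%Y%m%d)
{
    echo "### lslpp -L";         lslpp -L
    echo "### lsdev -C -H";      lsdev -C -H
    echo "### lsattr -EHl sys0"; lsattr -EHl sys0
    echo "### lscfg -v";         lscfg -v
    echo "### lspv";             lspv
    echo "### lsps -a";          lsps -a
    echo "### df";               df
    for vg in $(lsvg -o); do
        echo "### lsvg -l $vg";  lsvg -l "$vg"
    done
} > "$OUT" 2>&1
echo "Control book written to $OUT"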

Boot process of IBM AIX

This chapter describes the boot process and the different stages the system uses to prepare the AIX 5L environment. Topics discussed in this chapter are:
- The boot process
- System initialization
- The /etc/inittab file
- How to recover from a non-responsive boot process
- Run levels
- An introduction to the rc.* files

Storage management concepts

The fundamental concepts used by LVM are physical volumes, volume groups, physical partitions, logical volumes, logical partitions, file systems, and raw devices. Some of their characteristics are presented as follows:
- Each individual disk drive is a named physical volume (PV) and has a name such as hdisk0 or hdisk1.
- One or more PVs can make up a volume group (VG). A physical volume can belong to a maximum of one VG.
- You cannot assign a fraction of a PV to one VG. A physical volume is assigned entirely to a volume group.
- Physical volumes can be assigned to the same volume group even though they are of different types, such as SCSI or SSA.
- Storage space from physical volumes is divided into physical partitions (PPs). The size of the physical partitions is identical on all disks belonging to the same VG.
- Within each volume group, one or more logical volumes (LVs) can be defined. Data stored on logical volumes appears to be contiguous from the user point of view, but can be spread across different physical volumes from the same volume group.
- Logical volumes consist of one or more logical partitions (LPs). Each logical partition has at least one corresponding physical partition. A logical partition and a physical partition always have the same size. You can have up to three copies of the data located on different physical partitions. Usually, physical partitions storing identical data are located on different physical disks for redundancy purposes.
- Data from a logical volume can be stored in an organized manner, having the form of files located in directories. This structured and hierarchical form of organization is named a file system.
- Data from a logical volume can also be seen as a sequential string of bytes. This type of logical volume is named a raw logical volume. It is the responsibility of the application that uses this data to access and interpret it correctly.
- The volume group descriptor area (VGDA) is an area on the disk that contains information pertinent to the volume group that the physical volume belongs to. It also includes information about the properties and status of all physical and logical volumes that are part of the volume group. The information from the VGDA is used and updated by LVM commands. There is at least one VGDA per physical volume. Information from the VGDAs of all disks that are part of the same volume group must be identical. The VGDA internal architecture and location on the disk depend on the type of the volume group (original, big, or scalable).
- The volume group status area (VGSA) is used to describe the state of all physical partitions from all physical volumes within a volume group. The VGSA indicates if a physical partition contains accurate or stale information. The VGSA is used for monitoring and maintaining the synchronization of data copies. The VGSA is essentially a bitmap, and its architecture and location on the disk depend on the type of the volume group.
- A logical volume control block (LVCB) contains important information about the logical volume, such as the number of the logical partitions or disk allocation policy. Its architecture and location on the disk depend on the type of the volume group it belongs to. For standard volume groups, the LVCB resides on the first block of user data within the LV. For big volume groups, there is additional LVCB information in the VGDA on the disk. For scalable volume groups, all relevant logical volume control information is kept in the VGDA as part of the LVCB information area and the LV entry area.
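To make these concepts concrete, here is a small set of query commands (the volume group, disk and logical volume names are only examples) that walk the same hierarchy on a live system:

lsvg                 # volume groups known to the system
lsvg -o              # volume groups that are currently varied on
lsvg -p rootvg       # physical volumes, and their PP usage, in a VG
lsvg -l rootvg       # logical volumes defined in a VG
lspv hdisk0          # PV details: VGDA count, PP size, free PPs
lspv -l hdisk0       # which LVs occupy which PPs on this disk
lslv hd4             # LV details: copies, inter/intra-disk policy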

Specifying the default gateway on a specific interface in HACMP

When you're using HACMP, you usually have multiple network adapters installed and thus multiple network interfaces to deal with. If AIX configured the default gateway on the wrong interface (for example on your management interface instead of the boot interface), you may want to change this so network traffic isn't sent over the management interface. Here's how you can do this:

First, stop HACMP or do a take-over of the resource groups to another node; this will avoid any problems with applications when you start fiddling with the network configuration. Then open up a virtual terminal window to the host on your HMC. Otherwise you would lose the connection as soon as you drop the current default gateway.

Now you need to determine where your current default gateway is configured. You can do this by typing lsattr -El inet0 and netstat -nr. The lsattr command will show you the current default gateway route and the netstat command will show you the interface it is configured on. You can also check the ODM: odmget -q "attribute=route" CuAt.

Now, delete the default gateway like this:
lsattr -El inet0 | awk '$2 ~ /hopcount/ { print $2 }' | read GW
chdev -l inet0 -a delroute=${GW}

If you would now use the route command to specify the default gateway on a specific interface, like this:
route add 0 [ip address of default gateway: xxx.xxx.xxx.254] -if enX
you will have a working entry for the default gateway. But the route command does not change anything in the ODM, so as soon as your system reboots the default gateway is gone again. Not a good idea.

A better solution is to use the chdev command:
chdev -l inet0 -a addroute=net,-hopcount,0,,0,[ip address of default gateway]
This will set the default gateway to the first interface available. To specify the interface, use:
chdev -l inet0 -a addroute=net,-hopcount,0,if,enX,,0,[ip address of default gateway]
Substitute the correct interface for enX in the command above.

If you previously used the route add command and after that you use chdev to enter the default gateway, this will fail. You have to delete it first using route delete 0, and then give the chdev command. Afterwards, check with lsattr -El inet0 and odmget -q "attribute=route" CuAt that the new default gateway is properly configured. And of course, try to ping the IP address of the default gateway and some outside address. Now reboot your system and check that the default gateway remains configured on the correct interface. And start up HACMP again!
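Pulled together, the sequence might look like the sketch below (the gateway address and interface name are placeholders you must substitute; run it from an HMC console with HACMP stopped):

#!/bin/ksh
# Example only: move the persistent default gateway to interface en2
NEWGW=192.168.10.254     # placeholder gateway address
IFACE=en2                # placeholder interface

# 1. Capture and remove the current ODM default route
lsattr -El inet0 | awk '$2 ~ /hopcount/ { print $2 }' | read GW
[ -n "$GW" ] && chdev -l inet0 -a delroute="${GW}"

# 2. Add the default route bound to the chosen interface (persists in the ODM)
chdev -l inet0 -a addroute=net,-hopcount,0,if,${IFACE},,0,${NEWGW}

# 3. Verify
lsattr -El inet0
netstat -nr | grep default
ping -c 3 ${NEWGW}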

Useful HACMP commands

clstat - show cluster state and substate; needs clinfo.
cldump - SNMP-based tool to show cluster state.
cldisp - similar to cldump, perl script to show cluster state.
cltopinfo - list the local view of the cluster topology.
clshowsrv -a - list the local view of the cluster subsystems.
clfindres (-s) - locate the resource groups and display status.
clRGinfo -v - locate the resource groups and display status.
clcycle - rotate some of the log files.
cl_ping - a cluster ping program with more arguments.
clrsh - cluster rsh program that takes cluster node names as argument.
clgetactivenodes - which nodes are active?
get_local_nodename - what is the name of the local node?
clconfig - check the HACMP ODM.
clRGmove - online/offline or move resource groups.
cldare - sync/fix the cluster.
cllsgrp - list the resource groups.
clsnapshotinfo - create a large snapshot of the hacmp configuration.
cllscf - list the network configuration of an hacmp cluster.
clshowres - show the resource group configuration.
cllsif - show network interface information.
cllsres - show short resource group information.
lssrc -ls clstrmgrES - list the cluster manager state.
lssrc -ls topsvcs - show heartbeat information.
cllsnode - list a node centric overview of the hacmp configuration.

HACMP topology

HACMP resource groups can be configured in three ways:
1. Rotating
2. Cascading
3. Mutual takeover

The cascading and rotating resource groups are the "classic", pre-HA 5.1 types. The new "custom" type of resource group was introduced in HA 5.1 onwards.

Cascading resource group: Upon node failure, a cascading resource group falls over to the available node with the next priority in the node priority list. Upon node reintegration into the cluster, a cascading resource group falls back to its home node by default.

Cascading without fallback: With this option, whenever the primary node fails the package will fail over to the next available node in the list, and when the primary node comes back online the package will not fall back automatically. We need to move the package back to its home node at a convenient time.

Rotating resource group: This is almost similar to cascading without fallback; whenever the package fails over to a standby node it will never fall back to the primary node automatically, and we need to move it manually at our convenience.

Mutual takeover: With this option both nodes are in active-active mode. Whenever failover happens, the package on the failed node will move to the other active node and will run alongside the already existing package. Once the failed node comes back online we can move the package back manually.

AIX Authorization/authentication administration

Authorization/authentication administration
____ Report all password inconsistencies and do not fix them: pwdck -n ALL
____ Report all password inconsistencies and fix them: pwdck -y ALL
____ Report all group inconsistencies and do not fix them: grpck -n ALL
____ Report all group inconsistencies and fix them: grpck -y ALL
____ Browse the /etc/shadow, /etc/passwd and /etc/group files weekly

SUID/SGID
____ Review all SUID/SGID programs owned by root, daemon, and bin.
____ Review all SETUID programs: find / -perm -4000 -print
____ Review all SETGID programs: find / -perm -2000 -print
____ Review all sticky bit programs: find / -perm -1000 -print
____ Set user .profile in /etc/security/.profile

Permissions structures
____ System directories should have 755 permissions at a minimum
____ Root system directories should be owned by root
____ Use the sticky bit on the /tmp and /usr/tmp directories.
____ Run a checksum (md5) against all /bin, /usr/bin, /dev and /usr/sbin files.
____ Check device file permissions:
____ disk, storage, tape and network devices (should be 600) owned by root.
____ tty devices (should be 622) owned by root.
____ /dev/null should be 777.
____ List all hidden files in their directories (the .files).
____ List all writable directories (use the find command).
____ $HOME directories should be 710
____ $HOME .profile or .login files should be 600 or 640.
____ Look for un-owned files on the server: find / -nouser -print. Note: do not remove any /dev files.
____ Do not use r-type commands (rsh, rlogin, rcp and tftp) or .netrc or .rhosts files.
____ Change /etc/hosts file permissions to 660 and review its contents weekly.
____ Check for both tcp/udp failed connections to the servers: netstat -p tcp; netstat -p udp.
____ Verify contents of /etc/exports (NFS export file).
____ If using ftp, make this change to the /etc/inetd.conf file to enable logging: ftp stream tcp6 nowait root /usr/sbin/ftpd ftpd -l
____ Set NFS mounts to -ro (read only) and export only to the hosts that need them.
____ Consider using extended ACLs (please review the tcb man page).
____ Before making a network connection, collect a full system file listing and store it off-line: ls -Ra -la > /tmp/allfiles.system
____ Make use of the strings command to check on files: strings /etc/hosts | grep Kashmir

Recommendations

Remove unnecessary services
By default the Unix operating system gives us 1024 services to connect to; we want to parse this down to a more manageable value. There are two files in particular that we want to parse. The first is the /etc/services file itself. A good starting point is to eliminate all unneeded services and add services as you need them. Below is a screenshot of an existing ntp server's /etc/services file on one of my lab servers.

#
# Network services, Internet style
#
ssh     22/udp
ssh     22/tcp  mail
auth    113/tcp authentication
sftp    115/tcp
ntp     123/tcp         # Network Time Protocol
ntp     123/udp         # Network Time Protocol
#
# UNIX specific services
#
login   513/tcp
shell   514/tcp cmd     # no passwords used

Parse /etc/rc.tcpip file
This file starts the daemons that we will be using for the TCP/IP stack on AIX servers. By default the file will start the sendmail, snmp and other daemons. We want to parse this to reflect what functionality we need this server for. Here is the example for my ntp server.
# Start up the daemons
#
echo "Starting tcpip daemons:"
trap 'echo "Finished starting tcpip daemons."' 0

# Start up syslog daemon (for error and event logging)
start /usr/sbin/syslogd "$src_running"

# Start up Portmapper
start /usr/sbin/portmap "$src_running"

# Start up socket-based daemons
start /usr/sbin/inetd "$src_running"

# Start up Network Time Protocol (NTP) daemon
start /usr/sbin/xntpd "$src_running"

This also helps to better understand what processes are running on the server.

Remove unauthorized /etc/inittab entries
Be aware of what is in the /etc/inittab file on the AIX servers. This file works like the registry in a Microsoft environment. If an intruder wants to hide an automated script, he would want it launched here or in the cron file. Monitor this file closely.

Parse /etc/inetd.conf file
This is the AIX system file that starts system services, like telnet, ftp, etc. We also want to closely watch this file to see if there are any services that have been enabled without authorization. If you are using ssh, for example, this is what the inetd.conf file should look like. Because we are using other internet connections, this file is not used in my environment and should not be of use to you. This is why ssh should be used for all administrative connections into the environment. It provides an encrypted tunnel so connection traffic is secure. In the case of telnet, it is very trivial to sniff the UID and password.

## protocol. "tcp" and "udp" are interpreted as IPv4.
##
## service  socket  protocol  wait/   user  server   server program
## name     type              nowait        program  arguments
##

Edit /etc/rc.net
This is the network configuration file used by AIX. This is the file you use to set your default network route along with your no (network options) attributes. Because the servers will not be used as routers to forward traffic and we do not want to allow loose source routing, we will be making a few changes in this file. Many of them protect against DoS and DDoS attacks from the internet; they also protect against ACK and SYN attacks on the internal network.

##################################################################
# Changes made on 06/07/02 to tighten up socket states on this
# server.
##################################################################
if [ -f /usr/sbin/no ] ; then
    /usr/sbin/no -o udp_pmtu_discover=0    # stops autodiscovery of MTU
    /usr/sbin/no -o tcp_pmtu_discover=0    # on the network interface
    /usr/sbin/no -o clean_partial_conns=1  # clears incomplete 3-way conn.
    /usr/sbin/no -o bcastping=0            # protects against smurf icmp attacks
    /usr/sbin/no -o directed_broadcast=0   # stops packets to broadcast add.
    /usr/sbin/no -o ipignoreredirects=1    # prevents loose
    /usr/sbin/no -o ipsendredirects=0      # source routing
    /usr/sbin/no -o ipsrcrouterecv=0       # attacks on
    /usr/sbin/no -o ipsrcrouteforward=0    # our network
    /usr/sbin/no -o ip6srcrouteforward=0   # from using indirect
    /usr/sbin/no -o icmpaddressmask=0      # dynamic routes
    /usr/sbin/no -o nonlocsrcroute=0       # to attack us from
    /usr/sbin/no -o ipforwarding=0         # stops server from acting like a router
fi

Securing root
Change the /etc/motd banner:

This computer system is the private property of XYZ Insurance. It is for authorized use only. All users (authorized or non-authorized) have no explicit or implicit expectations of privacy.
Any or all users of this system and all the files on this system may be intercepted, monitored, recorded, copied, audited, inspected and disclosed to XYZ Insurance's management personnel. By using this system, the end user consents to such interception, monitoring, recording, copying, auditing, inspection and disclosure at the discretion of such personnel. Unauthorized or improper use of this system may result in civil and/or criminal penalties and administrative or disciplinary action, as deemed appropriate by said actions. By continuing to use this system, the individual indicates his/her awareness of and consent to these terms and conditions of use. LOG OFF IMMEDIATELY if you do not agree to the provisions stated in this warning banner.

Modify /etc/security/user
root:
    loginretries = 5    - failed retries until the account locks
    rlogin = false      - disables remote herald access to a root shell; need to su from another UID
    admgroups = system
    minage = 0          - minimum aging is no time value
    maxage = 4          - maximum aging is set to 30 days or 4 weeks
    umask = 22

Tighten up /etc/security/limits
This is an attribute that should be changed because of runaway resource hogs. An orphaned process can grow to use an exorbitant amount of disk space. To prevent this we can set the ulimit value here:
default:
    #fsize = 2097151
    fsize = 8388604     - sets the soft file size limit to 8388604 512-byte blocks (about 4 GB)

Variable changes in /etc/profile
Set the $TMOUT variable in /etc/profile. This will cause an open shell to close after 15 minutes of inactivity. It works in conjunction with the screensaver to prevent an open session being used to either delete the server or, worse, corrupt data on the server.
# Automatic logout, include in export line if uncommented
TMOUT=900

Sudo is your friend
This is a nice piece of code that the system administrators can use in order to allow "root-like" functionality. It allows a non-root user to run system binaries or commands. The /etc/sudoers file is used to configure exactly what the user can do. The service is configured and running on ufxcpidev. The developers are running a script called changeperms in order to tag their .ear files with their own ownership attributes. First we set up sudo to allow root-like ("superuser doer") access to sxnair.

# sudoers file.
#
# This file MUST be edited with the 'visudo' command as root.
#
# See the sudoers man page for the details on how to write a sudoers file.
#
# Host alias specification
# User alias specification
# Cmnd alias specification
# User privilege specification
root ALL=(ALL) ALL
sxnair,jblade,vnaidu ufxcpidev=/bin/chown * /usr/WebSphere/AppServer/installedApps/*
#
#
# Override the built in default settings
Defaults syslog=auth
Defaults logfile=/var/log/sudo.log

For more details, please see the XYZ Company Insurance Work Report that I compiled, or visit this URL: http://www.courtesan.com/sudo/.

Tighten user/group attributes
Change /etc/security/user
These are some of the changes to the /etc/security/user file that will promote a more heightened configuration of default user attributes at your company.
default:
    umask = 077          - defines the default umask; 077 makes new files readable only by the owning UID
    pwdwarntime = 7      - days of password expiration warnings
    loginretries = 5     - failed login attempts before the account is locked
    histexpire = 52      - defines how long a password cannot be re-used
    histsize = 20        - defines how many previous passwords the system remembers
    minage = 2           - minimum number of weeks a password is valid
    maxage = 8           - maximum number of weeks a password is valid
    maxexpired = 4       - maximum time in weeks a password can be changed after it expires

AIX Security Checklist

AIX Environment Procedures
The best way to approach this portion of the checklist is to do a comprehensive physical inventory of the servers. Serial numbers and physical location would be sufficient.
____ Record server serial numbers
____ Record the physical location of the servers

Next we want to gather a rather comprehensive list of both the AIX and pSeries inventories. By running these next four scripts we can gather the information to analyze.
____ Run these 4 scripts: sysinfo, tcpchk, nfschk and nethwchk. (See Appendix A for scripts)

____ sysinfo:
____ Determine active logical volume groups on the servers: lsvg -o
____ List physical volumes in each volume group: lsvg -p "vgname"
____ List logical volumes for each volume group: lsvg -l "vgname"
____ List physical volume information for each hard disk:
____ lspv hdiskx
____ lspv -p hdiskx
____ lspv -l hdiskx
____ List server software inventory: lslpp -L
____ List server software history: lslpp -h
____ List all hardware attached to the server: lsdev -C | sort -d
____ List system name, nodename, LAN network number, AIX release, AIX version and machine ID: uname -x
____ List all system resources on the server: lssrc -a
____ List inetd services: lssrc -t 'service name' -p 'process id'
____ List all host entries on the servers: hostent -S
____ Name all nameservers the servers have access to: namerslv -Is
____ Show status of all configured interfaces on the server: netstat -i
____ Show network addresses and routing tables: netstat -nr
____ Show interface settings: ifconfig
____ Check user and group system variables
____ Check users: usrck -t ALL
____ Check groups: grpck -t ALL
____ Run tcbck to verify if it is enabled: tcbck
____ Examine the AIX failed logins: who -s /etc/security/failedlogin
____ Examine the AIX user log: who /var/adm/wtmp
____ Examine the processes from users logged into the servers: who -p /var/adm/wtmp
____ List all user attributes: lsuser ALL | sort -d
____ List all group attributes: lsgroup ALL

____ tcpchk:
____ Confirm the tcp subsystem is installed: lslpp -l | grep bos.net
____ Determine if it is running: lssrc -g tcpip
____ Search for .rhosts and .netrc files: find / -name .rhosts -print ; find / -name .netrc -print
____ Check for rsh functionality on the host: cat /etc/hosts.equiv
____ Check for remote printing capability: cat /etc/hosts.lpd | grep -v '#'

____ nfschk:
____ Verify NFS is installed: lslpp -L | /bin/grep nfs
____ Check NFS/NIS status: lssrc -g nfs | /bin/grep active
____ Check to see if it is an NFS server and what directories are exported: cat /etc/xtab
____ Show hosts that export NFS directories: showmount
____ Show what directories are exported: showmount -e

____ nethwchk:
____ Show network interfaces that are connected: lsdev -Cc if
____ Display active connections on boot: odmget -q value=up CuAt | grep name | cut -c10-12
____ Show all interface status: ifconfig ALL

Root level access
____ Limit users who can su to another UID: lsuser -f ALL
____ Audit the sulog: cat /var/adm/sulog
____ Verify /etc/profile does not include the current directory
____ Lock down cron access
____ To allow root only: rm -i /var/adm/cron/cron.deny and rm -i /var/adm/cron/cron.allow
____ To allow all users: touch cron.allow (if the file does not already exist)
____ To allow a user access: touch /var/adm/cron/cron.allow then echo "UID" > /var/adm/cron/cron.allow
____ To deny a user access: touch /var/adm/cron/cron.deny then echo "UID" > /var/adm/cron/cron.deny
____ Disable direct herald root access: add rlogin=false to root in the /etc/security/user file or through smit
____ Limit the $PATH variable in /etc/environment. Use the user's .profile instead.
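A few of the recurring checks above lend themselves to a small audit script; this sketch (the report location and the particular checks included are my own selection) simply collects the evidence into one file for review:

#!/bin/ksh
# Weekly security evidence collection based on the checklist above
REPORT=/tmp/secaudit.$(date +%Y%m%d)
{
    echo "== setuid/setgid programs ==";  find / -type f \( -perm -4000 -o -perm -2000 \) -print 2>/dev/null
    echo "== .rhosts / .netrc files ==";  find / \( -name .rhosts -o -name .netrc \) -print 2>/dev/null
    echo "== un-owned files ==";          find / -nouser -print 2>/dev/null
    echo "== failed logins ==";           who -s /etc/security/failedlogin
    echo "== su activity ==";             cat /var/adm/sulog
    echo "== user sanity check ==";       usrck -n ALL
    echo "== group sanity check ==";      grpck -n ALL
} > "$REPORT" 2>&1
echo "Security audit written to $REPORT"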

HACMP ADAPTERS

As stated above, each network defined to HACMP should have at least two adapters per node. While it is possible to build a cluster with fewer, the reaction to adapter failures is more severe: the resource group must be moved to another node. AIX provides support for EtherChannel, a facility that can be used to aggregate adapters (increase bandwidth) and provide network resilience. EtherChannel is particularly useful for fast responses to adapter or switch failures. This must be set up with some care in an HACMP cluster. When done properly, this provides the highest level of availability against adapter failure. Refer to the IBM techdocs website http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD101785 for further details.

Many System p™ servers contain built-in Ethernet adapters. If the nodes are physically close together, it is possible to use the built-in Ethernet adapters on two nodes and a "cross-over" Ethernet cable (sometimes referred to as a "data transfer" cable) to build an inexpensive Ethernet network between two nodes for heartbeating. Note that this is not a substitute for a non-IP network.

Some adapters provide multiple ports. One port on such an adapter should not be used to back up another port on that adapter, since the adapter card itself is a common point of failure. The same thing is true of the built-in Ethernet adapters in most System p servers and currently available blades: the ports have a common adapter. When the built-in Ethernet adapter can be used, best practice is to provide an additional adapter in the node, with the two backing up each other.

Be aware of network detection settings for the cluster and consider tuning these values. In HACMP terms, these are referred to as NIM values. There are four settings per network type which can be used: slow, normal, fast and custom. With the default setting of normal for a standard Ethernet network, the network failure detection time would be approximately 20 seconds. With today's switched network technology this is a large amount of time. By switching to a fast setting, the detection time would be reduced by 50% (10 seconds), which in most cases would be more acceptable. Be careful, however, when using custom settings, as setting these values too low can cause false takeovers to occur. These settings can be viewed using a variety of techniques, including lssrc -ls topsvcs (from a node which is active), odmget HACMPnim | grep -p ether, and smitty hacmp.

Applications
The most important part of making an application run well in an HACMP cluster is understanding the application's requirements. This is particularly important when designing the Resource Group policy behavior and dependencies. For high availability to be achieved, the application must have the ability to stop and start cleanly and not explicitly prompt for interactive input. Some applications tend to bond to a particular OS characteristic such as a uname, serial number or IP address. In most situations, these problems can be overcome. The vast majority of commercial software products which run under AIX are well suited to be clustered with HACMP.

Application Data Location
Where should application binaries and configuration data reside? There are many arguments in this discussion. Generally, keep all the application binaries and data where possible on the shared disk, as it is easy to forget to update them on all cluster nodes when they change. This can prevent the application from starting or working correctly when it is run on a backup node. However, the correct answer is not fixed. Many application vendors have suggestions on how to set up the applications in a cluster, but these are recommendations. Just when it seems to be clear cut as to how to implement an application, someone thinks of a new set of circumstances. Here are some rules of thumb:

If the application is packaged in LPP format, it is usually installed on the local file systems in rootvg. This behavior can be overcome by bffcreate'ing the packages to disk and restoring them with the preview option. This action will show the install paths; symbolic links can then be created prior to install which point to the shared storage area. If the application is to be used on multiple nodes with different data or configuration, then the application and configuration data would probably be on local disks and the data sets on shared disk, with application scripts altering the configuration files during fallover. Also, remember that the HACMP File Collections facility can be used to keep the relevant configuration files in sync across the cluster. This is particularly useful for applications which are installed locally.

Start/Stop Scripts
Application start scripts should not assume the status of the environment. Intelligent programming should correct any irregular conditions that may occur. The cluster manager spawns these scripts off in a separate job in the background and carries on processing. Some things a start script should do are:
- First, check that the application is not currently running! This is especially crucial for v5.4 users, as resource groups can be placed into an unmanaged state (the forced down action in previous versions). Using the default startup options, HACMP will rerun the application start script, which may cause problems if the application is actually running. A simple and effective solution is to check the state of the application on startup. If the application is found to be running, simply end the start script with exit 0.
- Verify the environment. Are all the disks, file systems, and IP labels available?
- If different commands are to be run on different nodes, store the executing HOSTNAME in a variable.
- Check the state of the data. Does it require recovery? Always assume the data is in an unknown state, since the conditions that occurred to cause the takeover cannot be assumed.
- Are there prerequisite services that must be running? Is it feasible to start all prerequisite services from within the start script? Is there an inter-resource-group dependency or resource group sequencing that can guarantee the previous resource group has started correctly? HACMP v5.2 and later has facilities to implement checks on resource group dependencies, including collocation rules in HACMP v5.3.
- Finally, when the environment looks right, start the application. If the environment is not correct and error recovery procedures cannot fix the problem, ensure there are adequate alerts (email, SMS, SNMP traps etc.) sent out via the network to the appropriate support administrators.

Stop scripts are different from start scripts in that most applications have a documented start-up routine and not necessarily a stop routine. The assumption is: once the application is started, why stop it? Relying on a failure of a node to stop an application will be effective, but to use some of the more advanced features of HACMP the requirement exists to stop an application cleanly. Some of the issues to avoid are:
- Be sure to terminate any child or spawned processes that may be using the disk resources. Consider implementing child resource groups.
- Verify that the application is stopped to the point that the file system is free to be unmounted. The fuser command may be used to verify that the file system is free.
- In some cases it may be necessary to double check that the application vendor's stop script did actually stop all the processes, and occasionally it may be necessary to forcibly terminate some processes. Clearly the goal is to return the machine to the state it was in before the application start script was run.
- Do not fail to exit the stop script with a zero return code, as a non-zero return code will stop cluster processing. (Note: this is not the case with start scripts!)

Remember, most vendor stop/start scripts are not designed to be cluster proof! A useful tip is to have the stop and start scripts verbosely log their output, in the same format, to the /tmp/hacmp.out file. This can be achieved by including the following line in the header of the script: set -x && PS4="${0##*/}"'[$LINENO] '
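The sketch below illustrates those start-script checks (the application name, process pattern, file system and helper commands are all hypothetical placeholders, not an official HACMP template):

#!/bin/ksh
# Hypothetical HACMP application server start script skeleton
set -x && PS4="${0##*/}"'[$LINENO] '
exec >> /tmp/hacmp.out 2>&1

APP=myapp                      # placeholder application name
APPDIR=/sharedvg/myapp         # placeholder shared file system
NODE=$(hostname)

# 1. Do nothing if the application is already running (unmanaged RG case)
if ps -ef | grep "[m]yapp_server" > /dev/null; then
    echo "$APP already running on $NODE, exiting 0"
    exit 0
fi

# 2. Verify the environment: the shared file system must be mounted
if ! mount | grep -w "$APPDIR" > /dev/null; then
    echo "$APPDIR not mounted on $NODE, aborting start and alerting support"
    exit 1
fi

# 3. Check the state of the data; recover if required (application specific)
"$APPDIR"/bin/checkdb || "$APPDIR"/bin/recoverdb

# 4. Start the application
"$APPDIR"/bin/start_myapp
echo "$APP started on $NODE"
exit 0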

HACMP Basics

History
IBM's HACMP has existed for almost 15 years. It was not originally an IBM product; IBM bought it from CLAM, which was later renamed Availant and is now called LakeView Technology. Until August 2006, all development of HACMP was done by CLAM. Nowadays IBM does its own development of HACMP in Austin, Poughkeepsie and Bangalore.

IBM's high availability solution for AIX, High Availability Cluster Multi-Processing (HACMP), consists of two components:
- High Availability: the process of ensuring an application is available for use through the use of duplicated and/or shared resources (eliminating Single Points Of Failure, SPOFs).
- Cluster Multi-Processing: multiple applications running on the same nodes with shared or concurrent access to the data.

A high availability solution based on HACMP provides automated failure detection, diagnosis, application recovery and node reintegration. With an appropriate application, HACMP can also provide concurrent access to the data for parallel processing applications, thus offering excellent horizontal scalability.

What needs to be protected? Ultimately, the goal of any IT solution in a critical environment is to provide continuous service and data protection. High availability is just one building block in achieving the continuous operation goal. High availability is based on the availability of the hardware, software (OS and its components), application and network components. The main objective of HACMP is to eliminate Single Points of Failure (SPOFs): "…A fundamental design goal of (successful) cluster design is the elimination of single points of failure (SPOFs)…"

Eliminating single points of failure:
- Node: use multiple nodes
- Power source: use multiple circuits or uninterruptible power supplies
- Network/adapter: use redundant network adapters
- Network: use multiple networks to connect nodes
- TCP/IP subsystem: use non-IP networks to connect adjoining nodes and clients
- Disk adapter: use redundant disk adapters or multiple adapters
- Disk: use multiple disks with mirroring or RAID
- Application: add a node for takeover; configure an application monitor
- Administrator: add a backup administrator or a very detailed operations guide
- Site: add an additional site

Cluster Components
Here are the recommended practices for important cluster components.

Nodes
HACMP supports clusters of up to 32 nodes, with any combination of active and standby nodes. While it is possible to have all nodes in the cluster running applications (a configuration referred to as "mutual takeover"), the most reliable and available clusters have at least one standby node: one node that is normally not running any applications, but is available to take them over in the event of a failure on an active node.

Additionally, it is important to pay attention to environmental considerations. Nodes should not have a common power supply, which may happen if they are placed in a single rack. Similarly, building a cluster of nodes that are actually logical partitions (LPARs) within a single footprint is useful as a test cluster, but should not be considered for availability of production applications.

Nodes should be chosen that have sufficient I/O slots to install redundant network and disk adapters; that is, twice as many slots as would be required for single node operation. This naturally suggests that processors with small numbers of slots should be avoided. Use of nodes without redundant adapters should not be considered best practice. Blades are an outstanding example of this. And, just as every cluster resource should have a backup, the root volume group in each node should be mirrored, or be on a RAID device.

Nodes should also be chosen so that when the production applications are run at peak load, there are still sufficient CPU cycles and I/O bandwidth to allow HACMP to operate. The production application should be carefully benchmarked (preferable) or modeled (if benchmarking is not feasible) and nodes chosen so that they will not exceed 85% busy, even under the heaviest expected load. Note that the takeover node should be sized to accommodate all possible workloads: if there is a single standby backing up multiple primaries, it must be capable of servicing multiple workloads. On hardware that supports dynamic LPAR operations, HACMP can be configured to allocate processors and memory to a takeover node before applications are started. However, these resources must actually be available, or acquirable through Capacity Upgrade on Demand. The worst case situation, e.g. all the applications running on a single node, must be understood and planned for.

Networks
HACMP is a network centric application. HACMP networks not only provide client access to the applications but are used to detect and diagnose node, network and adapter failures. To do this, HACMP uses RSCT, which sends heartbeats (UDP packets) over ALL defined networks. By gathering heartbeat information on multiple nodes, HACMP can determine what type of failure has occurred and initiate the appropriate recovery action. Being able to distinguish between certain failures, for example the failure of a network and the failure of a node, requires a second network! Although this additional network can be "IP based", it is possible that the entire IP subsystem could fail within a given node. Therefore, in addition there should be at least one, ideally two, non-IP networks. Failure to implement a non-IP network can potentially lead to a partitioned cluster, sometimes referred to as "split brain" syndrome. This situation can occur if the IP network(s) between nodes become severed or, in some cases, congested. Since each node is, in fact, still very much alive, HACMP would conclude the other nodes are down and initiate a takeover. After takeover has occurred, the application(s) could potentially be running simultaneously on both nodes. If the shared disks are also online to both nodes, then the result could lead to data divergence (massive data corruption). This is a situation which must be avoided at all costs.

The most convenient way of configuring non-IP networks is to use disk heartbeating, as it removes the problems of distance associated with RS232 serial networks. Disk heartbeat networks only require a small disk or LUN. Be careful not to put application data on these disks: although it is possible to do so, you don't want any conflict with the disk heartbeat mechanism!

HACMP log files

/usr/sbin/cluster/etc/rhosts and /usr/es/sbin/cluster/etc/rhosts - used to accept incoming communication from clcomdES (cluster communication daemon, enhanced security).
Note: If there is an unresolvable label in the /usr/es/sbin/cluster/etc/rhosts file, then all clcomdES connections from remote nodes will be denied.

Cluster daemons:
- cluster manager (clstrmgrES)
- cluster lock daemon (clockdES)
- cluster SMUX peer daemon (clsmuxpdES)

clcomdES is used for cluster configuration operations such as cluster synchronisation, cluster management (C-SPOC) and dynamic reconfiguration (DARE) operations.

For clcomdES there should be at least 20 MB of free space in the /var file system:
- /var/hacmp/clcomd/clcomd.log requires 2 MB
- /var/hacmp/clcomd/clcomdiag.log requires 18 MB
- an additional 1 MB is required for the /var/hacmp/odmcache directory

clverify.log is also kept under /var:
- /var/hacmp/clverify/current//* contains logs from the current execution of clverify
- /var/hacmp/clverify/pass//* contains logs from the last passed verification
- /var/hacmp/clverify/pass.prev//* contains logs from the second-to-last passed verification
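Since clcomdES misbehaves when /var is tight, a quick check such as this is worth scripting (the awk column assumes the standard AIX df -m output layout):

# Verify there is roughly 20 MB free in /var and peek at the clcomd logs
df -m /var | awk 'NR==2 { print "/var free:", $3, "MB" }'
ls -l /var/hacmp/clcomd/clcomd.log /var/hacmp/clcomd/clcomdiag.log 2>/dev/null
tail -5 /var/hacmp/clcomd/clcomd.log 2>/dev/null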

HA failover scenarios

1. Graceful
For a graceful failover, you can run "smitty clstop" and then select the graceful option. This will not change anything except stopping the cluster on that node.
Note: If you stop the cluster, check the status using lssrc -g cluster. Sometimes the clstrmgrES daemon will take a long time to stop; DO NOT KILL THIS DAEMON. It will stop automatically after a while.
You can do this on both the nodes.

2. Takeover
For takeover, run "smitty clstop" with the takeover option. This will stop the cluster on that node and the standby node will take over the package.
You can do this on both the nodes.

3. Soft Package Failover
Run smitty cm_hacmp_resource_group_and_application_management_menu >>> Move a Resource Group to Another Node >>> select the package name and node name >>> enter.
This will move the package from that node to the node that you have selected in the above menu. This method gives a lot of trouble in HA 4.5, whereas it runs well on HA 5.2 unless there are application startup issues.
You can do this on both the nodes.

4. Failover Network Adapter(s)
For this type of testing, run "ifconfig enX down"; the package IP will then fail over to the primary adapter. You will not even see any outage. We can manually bring it back to the original adapter (ifconfig enX up), but it is better to reboot the server to bring the package back to the original node.

5. Hardware Failure (crash)
This is a standard type of testing; run the command "reboot -q" and the node will go down without stopping any apps and come up immediately. The package will fail over to the standby node within about 2 minutes of downtime (even though the HA failover is fast, some apps will take a long time to start).