SHIFT-WIKI - Sjoerd Hooft's InFormation Technology

This wiki is my personal documentation blog. Please enjoy it, and feel free to reach out through Bluesky if you have a question, remark, improvement or observation. See below for the latest additions, or use the search or tags to browse for content.


Df And Du Show Different Space Usage

Summary: What to do when df and du show different space usage.
Date: Around 2011
Refactor: 21 February 2025: Checked links and formatting.

→ Read more...

2025/06/01 11:59

Determining Number Of VMs Per VMFS Volume

Summary: How to determine how many VMs you can run on your VMware volumes.
Date: Around 2009
Refactor: 21 February 2025: Checked links and formatting.

This article explains the following article in more detail:

Yellow Bricks article: Max amount of VMs per VMFS volume

The reason for this is that both articles are quite technical and can be confusing. I have a lot of experience with storage, I would say a bit more than the average system administrator, but not as much as the people who have worked with NetApp and EMC for 20 years. So I decided to write an article that goes through the articles step by step, leaves out the steps that are not relevant for my environment, and provides an extensive example for future reference.

My environment:

  • ESX 4.1 update 1
  • VMs are 80% Windows Server 2003 and 20% Windows Server 2008
  • Storage is an IBM NSeries N6060, which is a rebranded NetApp FAS3160
  • Storage link is 4 Gb Fibre Channel
  • Disks are Fibre Channel disks

Excluded step: I'm not taking SCSI reservations into consideration. See the quote below for the kinds of operations that cause SCSI reservations. In my environment I do not expect these kinds of operations to happen on a daily basis:

VMFS is a clustered file system and uses SCSI reservations as part of its distributed locking algorithms. Administrative operations, such as creating or deleting a virtual disk, extending a VMFS volume, or creating or deleting snapshots, result in metadata updates to the file system using locks, and thus result in SCSI reservations. Reservations are also generated when you expand a virtual disk for a virtual machine with a snapshot. A reservation causes the LUN to be available exclusively to a single ESX host for a brief period of time. Although it is acceptable practice to perform a limited number of administrative tasks during peak hours, it is preferable to postpone major maintenance or configuration tasks to off-peak hours in order to minimize the impact on virtual machine performance.

Remember that changing settings and/or defaults can do more harm than good if you have not properly analyzed your environment. Default values are there for a reason.

To arrive at an actual number of virtual machines per VMFS volume we have to gather data first. Data gathering consists of two parts:

  1. Performance and settings data gathering
  2. Capacity data gathering

→ Read more...

2025/06/01 11:59

Script: PowerCLI: Deploy Storage to vSphere

Summary: A script to configure storage in vSphere.
Date: 29 December 2011
Refactor: 21 February 2025: Checked links and formatting.

I'm actually quite proud of this script. It creates a volume and LUNs, connects the LUNs to the required ESX hosts and creates a datastore on each LUN. On top of that it also sets the correct MPIO settings on each ESX host:

########################################################################################################################
# Author : Sjoerd Hooft
# Date Initial Version: 29 December 2011
# Comments: sjoerd_warmetal_nl
#
# Description:
# This script creates a volume per vSphere cluster, then creates LUNs, then connects the newly created lun to the esx hosts inside the cluster and creates the datastores on them.
# Make sure to modify the variables for your environment
#
# Credits:
# The idea for this script is partly from https://communities.netapp.com/docs/DOC-6181, which roughly does the same for NFS volumes.
#
# Recommendations:
# This script must run from the PowerCLI.
#
# Changes:
# Please comment on your changes to the script (your name and email address, line number, description):
# DATE - USERNAME - EMAILADDRESS - CHANGE DESCRIPTION
########################################################################################################################
 
# Script Variables
$timestamp = Get-Date -format "yyyyMMdd-HH.mm"
 
# Getting credentials for vCenter
while ($vcentercredentials -eq $null) {
  Write-Host (get-date -uformat %I:%M:%S) "Please provide authentication credentials for vCenter" -ForegroundColor Green;
  $vcentercredentials = $Host.UI.PromptForCredential("Please enter credentials", "Enter vCenter credentials", "intranet\adminshf", "")
}
 
# Getting credentials for NetApp filer
while ($filercredentials -eq $null) {
  Write-Host (get-date -uformat %I:%M:%S) "Please provide authentication credentials for NetApp filer" -ForegroundColor Green;
  $filercredentials = $Host.UI.PromptForCredential("Please enter credentials", "Enter NetApp filer credentials", "root", "")
}
 
# Setting variables vSphere
$vCenter = "vcenter"
$cluster = "BCP Test"
 
# Setting variables NetApp
$filer = "netappfiler01"
$aggr = "aggr1"
$newvol = "ESX_BCPTEST"
$numberofluns = 2
$lunname = "BCP_"
$lunsize = 4
$size = "g"
$igroup = "ESX_BCP_Hosts"
$startlunid = 201
 
# Import DATA ONTAP Module
Import-module D:\sjoerd\DataONTAP
 
# Connect to vCenter and NetApp Filer
Connect-VIServer $vCenter -Credential $vcentercredentials | Out-Null
Connect-NaController $Filer -Credential $filercredentials | Out-Null
 
# Create a new volume
$tmpvolsize = (($numberofluns * $lunsize) + 1)
# Append the size unit to get a size string such as "9g"
$volsize = "$tmpvolsize$size"
Write-Host (get-date -uformat %I:%M:%S) "Creating volume $newvol in aggregate $aggr with size $volsize." -ForegroundColor Green;
New-NaVol $newvol $aggr $volsize | Out-Null
 
# Set options for the new volume
Set-NaVolOption $newvol no_atime_update yes | Out-Null
Set-NaVolOption $newvol fractional_reserve 0 | Out-Null
Set-NaSnapshotreserve $newvol 0 | Out-Null
Set-NaSnapshotSchedule $newvol -Weeks 0 -Days 0 -Hours 0 -WhichHours 0 | Out-Null
 
# Creating LUNs
$luncounter = 0
$lunid = $startlunid
$fulllunsize = "$lunsize$size"
$esxhost = Get-Cluster $cluster | Get-VMHost | sort | Select-Object -First 1
while ($luncounter -le $numberofluns-1) {
  $lunname | foreach{
      Write-Host (get-date -uformat %I:%M:%S) "Creating lun /vol/$newvol/$_$lunid with size $fulllunsize. " -ForegroundColor Green;
      New-NaLun /vol/"$newvol"/"$_$lunid" -size $fulllunsize -Type vmware | Out-Null
      Set-NaLunComment /vol/"$newvol"/"$_$lunid" "LUN for ESX Cluster $cluster" | Out-Null
      Add-NaLunMap /vol/"$newvol"/"$_$lunid" -InitiatorGroup $igroup -ID $lunid | Out-Null
      Get-VMHostStorage $esxhost -RescanAllHba -RescanVmfs | Out-Null
      # Match the LUN ID at the end of the runtime name (assumed to look like vmhba1:C0:T0:L201), so that ID 20 does not also match L201
      $path = Get-ScsiLun -vmhost $esxhost -luntype disk | where { $_.runtimename -match ":L$($lunid)$" }
      Write-Host (get-date -uformat %I:%M:%S) "Creating datastore $lunname$lunid on path $path on host $esxhost." -ForegroundColor Green;
      New-Datastore -vmhost $esxhost -Name "$lunname$lunid" -path $path -vmfs | Out-Null
  }
  $luncounter++
  $lunid++
}
 
# Rescan storage and datastores on all hosts in the cluster and set MPIO settings for each host.
$esxhosts = Get-Cluster $cluster | Get-VMHost
$rootpasswordsecure = Read-Host -assecurestring "Please enter the password of the root user to configure the MPIO settings on each host"
Function ConvertToPlainText( [security.securestring]$rootpasswordsecure ) {
$marshal = [Runtime.InteropServices.Marshal]
$marshal::PtrToStringAuto( $marshal::SecureStringToBSTR($rootpasswordsecure) ) }
$rootpassword = ConvertToPlainText $rootpasswordsecure
ForEach ($esxhost in $esxhosts) {
   Write-Host (get-date -uformat %I:%M:%S) "Scanning storage on host $esxhost" -ForegroundColor Green;
   Get-VMHostStorage $esxhost -RescanAllHba -RescanVmfs | Out-Null
   Write-Host (get-date -uformat %I:%M:%S) "Setting MPIO on host $esxhost. Logging data to D:\adminshf\mpath_$esxhost" -ForegroundColor Green;
   # Checking for the rsa key to be in cache
   D:\adminshf\plink -ssh $esxhost -l root -pw $rootpassword echo "Connection to $esxhost successful"
   D:\adminshf\plink -ssh $esxhost -l root -pw $rootpassword /opt/ontap/santools/config_mpath --primary --loadbalance --policy fixed > D:\adminshf\mpath_"$esxhost"_"$timestamp"
   Write-Host (get-date -uformat %I:%M:%S) "Done on host $esxhost" -ForegroundColor Green;
}

This wiki has been made possible by:

2025/06/01 11:59

Script: Bash: AIX: Daily Check Script

Summary: A daily script to check all sorts of things on AIX.
Date: 27 December 2010
Refactor: 21 February 2025: Checked links and formatting.

#!/bin/bash
########################################################################################################################
# Author : Sjoerd Hooft
# Date Initial Version: 27 Dec 2010
# Comments: sjoerd_@_warmetal_nl
#
# Description:
# This is a sample script to perform the daily checks on AIX servers.
#
# Recommendations:
# The script is designed for a 120 column terminal.
# The running user must be able to do a passwordless sudo to root.
#
# Changes:
# Please comment on your changes to the script (your name and email address, line number, description):
########################################################################################################################
 
# Script Variables
HOSTNAME_SHORT=`hostname -s`
AUTOMATIC=0
BASEDIR=`dirname $0`
LOGFILE="$BASEDIR/dc.log"
WHATAMI=`basename $0`
DATE=`date +%Y%m%d`
TOMAIL=sjoerd_@_warmetal_nl
BOLD=`tput bold`
BOLDOFF=`tput sgr0`
 
# Directories
APPDIR="/var/log/APP"
WASDIR="/opt/WAS_Profiles/AppSrv/logs"
FILE3DIR="/var/data/FILE3"
FILE1DIR="/var/data/FILE1/log"
JMSDIR="/var/data/app/jms_errors"
TOMCATDIR="/var/log/app"
 
# Oracle Variables
ORACLE_HOME="/opt/oracle/product/10.2"
ORACLE_BASE="/opt/oracle"
ORACLE_SID_DB1=db1
ORACLE_SID_DB2=db2
export ORACLE_HOME ORACLE_BASE
 
# Function to pause the script
# The operator can evaluate the outcome of the previous function
scriptContinue () {
   if [ "$AUTOMATIC" == "0" ]; then
      echo "Press ENTER to continue"
      read CONTINUE
      clear
   fi
}
 
# Function that will list the AIX internal errors
checkErrors () {
   echo "$BOLD Listing the Error Logging Facility: $BOLDOFF"
   errpt
   echo
}
 
# Function that will clear all AIX internal errors
clearErrors () {
   echo "Clearing Errors"
   sudo errclear 0
}
 
# Function that will let the operator view the AIX internal errors in detail
viewErrors () {
   echo "Viewing Errors"
   errpt -a | less
}
 
# Function that will remove all files from the protected directory that holds JMS/MQ errors
removeJms () {
   echo "Are you sure you want to remove the JMS error files from $JMSDIR? "
   echo "If you hesitate, press CTRL+C to exit the script. "
   scriptContinue
   echo "Removing these files: "
   echo $JMSDIR/*
   sudo rm $JMSDIR/*
   echo
   echo "Done"
   echo
}
 
# Function that will check the last 4 logfiles from 4 different applications
# This is possible with multiple for loops since the files are named similarly
# Known errors are being skipped
# It will show only the last 10 entries per logfile
checkLog-abs () {
   echo "$BOLD Checking abs-logs in $APPDIR $BOLDOFF "
   echo "Note: we check the last 4 logfiles and skip any known error, and limit the amount of lines to 10."
   for application in appserver1 appserver2 appserver3; do
      for logfile in app.log.4 app.log.3 app.log.2 app.log; do
         echo "Checking $BOLD $application-$logfile $BOLDOFF "
         # The log directory is $APPDIR; $ABSDIR was never defined
         grep ERROR "$APPDIR/$application-$logfile" | \
            grep -v 'LDAP: error code 32 - No Such Object' | \
            grep -v 'doRefreshProposalsResponse didn.d send the email caught - ignoring' | \
            grep -v 'Error getting active tan: No TAN available for user' | \
            grep -v 'CORBA OBJECT_NOT_EXIST' | \
            tail -10
         echo
      done
      scriptContinue
      clear
   done
   echo
}
 
# Function to check the SystemOut.log from the websphere applications
# Known errors are being skipped
# It will show only the last 10 entries per logfile
checkLog-was () {
   echo "$BOLD Checking websphere logs in $WASDIR $BOLDOFF "
   for server in server1 server2 server3 server4; do
      echo "Checking $BOLD ${server}_Server/SystemOut.log $BOLDOFF "
      cat $WASDIR/${server}_Server/SystemOut.log | grep -i error | \
         grep -v 'oracle.jdbc.driver.DatabaseError.throwSqlException' | \
         grep -v 'The Network Adapter could not establish the connectionDSRA0010E' | \
         grep -v 'Error creating XA Connection and Resource com.ibm.ws.exception.WsException: DSRA8100E' | \
         grep -v 'Error creating XA Connection and Resource java.security.PrivilegedActionException:' | \
         tail -10
      echo
      scriptContinue
   done
   echo
}
 
# Function to check whether files have been processed.
# They will have a different extension.
checkFiles-host3 () {
   echo "$BOLD Checking the process on $HOSTNAME_SHORT $BOLDOFF "
   echo "There should be no files ending on .txt older than one hour:"
   echo "Last 10 files ending on .txt in $FILE3DIR:"
   ls -ltr $FILE3DIR | grep '\.txt$' | tail -10
   echo
   echo "$BOLD Checking the process on $HOSTNAME_SHORT $BOLDOFF "
   echo "There should be recent (last 24 hours) files:"
   echo "Last 10 files in $FILE3DIR:"
   ls -ltr $FILE3DIR | grep '\.txt' |  tail -10
   echo
   scriptContinue
}
 
# Function that will check whether error files exist
# It will allow the operator, after examining the size, to delete them
# Continue works only in this menu structure because this is the last check for this host
checkFiles-host1 () {
   echo "$BOLD Checking MQ process error files on $HOSTNAME_SHORT $BOLDOFF "
   echo "Checking for jms (MQ) errors in $JMSDIR, there should be no files in this directory:"
   ls -ltr $JMSDIR
   if [ $AUTOMATIC == 0 ]; then
      JMSACTION=`ls -ltr $JMSDIR | wc -l`
      if [ $JMSACTION -gt 1 ]; then
         echo
         echo "${BOLD}There are files in this directory!$BOLDOFF If all files are really small ( < 100 bytes ) you can delete them. "
         echo "   Would you like to do that right now?"
         echo
         echo "remove            - remove all files in $JMSDIR"
         echo "continue          - continue with dailycheck"
         echo
         menuChoice
      fi
      scriptContinue
   else
      echo "AUTOMATIC mode is on. If there are any files run the script manually on $HOSTNAME_SHORT "
   fi
}
 
# Function to check Oracle logfile bdump for errors
# It will show the line with the error, as well as the 2 lines before and after
# It will show only the last 10 entries per logfile
checkLog-ora () {
   ORALOGDIR="/var/log/oracle/10.2/${ORACLE_SID}/bdump"
   echo "$BOLD Checking the Oracle logfile $ORALOGDIR/alert_$ORACLE_SID.log $BOLDOFF "
   echo "The last 10 ORA- messages are displayed, including the 2 lines before and the two lines after "
   sudo cat $ORALOGDIR/alert_$ORACLE_SID.log | sed -e '
      1{$!N;$d;}
      $!N;/ORA-/!D
      $!N;$d;N;p
      g;$!N;$d;N;D
      '| tail -10
   echo
   scriptContinue
}
 
# Function to check tomcat application servers for errors
# It will evaluate all logfiles created the last four days
# Known errors are being skipped
checkLog-tomcat () {
   echo "$BOLD Checking the tomcat application server logs on $HOSTNAME_SHORT $BOLDOFF "
   echo "$BOLD Checking Tomcat logfiles: $BOLDOFF"
   echo "Checking the last four days of $TOMCATDIR/application.log files"
   find $TOMCATDIR/app/. -type f -name 'application*' -mtime -3 -print -exec cat {} \; | grep ERROR
   echo
   echo "Checking the last four days of $TOMCATDIR/framework.log files"
   find $TOMCATDIR/app/. -type f -name 'framework*' -mtime -3 -print -exec cat {} \; | grep ERROR
   echo
   scriptContinue
}
 
# Function to expand the options handling AIX system errors
actionErrors () {
   menuStart
   checkErrors
   echo "Note: The system clears all hardware errors automatically after 90 days, and all other errors after 30 days."
   echo
   echo "clearerrors       - clear all errors now"
   echo "viewerrors        - review errors in less"
}
 
# Function to specify which host the script runs on
# Declare host specific variables
# Set the actions to be taken
hostSpecific () {
   clear
   if [ "$HOSTNAME_SHORT" == "host1" ]; then
      checkLog-abs
      checkLog-was
      checkFiles-host1
   fi
   if [ "$HOSTNAME_SHORT" == "host2" ]; then
      export ORACLE_SID=$ORACLE_SID_DB2
      checkLog-ora
   fi
   if [ "$HOSTNAME_SHORT" == "host3" ]; then
      export ORACLE_SID=$ORACLE_SID_DB1
      checkLog-ora
      checkLog-tomcat
      checkFiles-host3
   fi
}
 
# Function to clear the screen and give the idea of a pretty script
menuStart () {
   clear
   echo "########################################################################################################################"
   echo "################################################### Daily Check Menu ###################################################"
   echo
}
 
# Function to show the operator the default menu options
menuEnd () {
   echo
   echo "errors            - take further actions regarding errors"
   echo "host              - start host specific checks"
   echo "auto              - restarts the script and runs it automatically, after which the logfile is mailed to $TOMAIL "
   echo "                  - this also works from the commandline: $WHATAMI auto "
   echo
   echo "exit              - exit"
   echo
}
 
# Function to read the menu option from the operator
# This menu is used for all required menus in the script
menuChoice () {
   echo "Enter menu choice: [exit]"
   read MENUCHOICE
 
   if [ -z "$MENUCHOICE" ]; then
      MENUCHOICE="exit"
   fi
 
   case $MENUCHOICE in
 
   errors )
      actionErrors
      menuChoice
   ;;
 
   host )
      hostSpecific
   ;;
 
   clearerrors )
      clearErrors
   ;;
 
   viewerrors )
      viewErrors
   ;;
 
   auto )
      $BASEDIR/$WHATAMI auto
      exit
   ;;
 
   exit )
      exit
   ;;
 
   remove )
      removeJms
   ;;
 
   continue )
      echo
   ;;
 
   * )
      echo "Wrong Input"
      menuChoice
   ;;
 
   esac
}
 
# Function to mail the log when the script has run automatically
mailLog () {
   cat $LOGFILE | mail -s "Report $WHATAMI on $HOSTNAME_SHORT of $DATE" $TOMAIL
}
 
# Function to determine whether the script should run automatically
# Set the automatic variables to send the output to a logfile instead of a screen
# and make the logfile readable by removing bold text markers
# It also makes sure the logfile gets mailed
if [ "$1" == "auto" ]; then
   AUTOMATIC=1
   BOLD=
   BOLDOFF=
   exec > $LOGFILE 2>&1
   checkErrors
   hostSpecific
   mailLog
   exit
fi
 
# Actual script:
# Infinite while loop: as long as the script is not exited,
# start the menu, check for errors and ask the operator what to do
while :
do
   menuStart
   checkErrors
   menuEnd
   menuChoice
done
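
Two of the log checks in the script above, skipping known errors and showing the lines around ORA- messages, can also be written more compactly. The block below is only a minimal sketch and not the method the script uses: the names KNOWN_ERRORS, filter_known_errors and show_context are hypothetical, the pattern list is a sample taken from the script, and it assumes a grep that accepts a newline-separated pattern list with -e and an awk that accepts -v, as POSIX describes.

```shell
#!/bin/bash
# Sketch only: KNOWN_ERRORS, filter_known_errors and show_context are
# hypothetical names and are not part of the original script.

# Sample list of known, harmless errors: one basic regular expression per line
KNOWN_ERRORS='LDAP: error code 32 - No Such Object
CORBA OBJECT_NOT_EXIST'

# Read log lines from stdin, keep the ERROR lines and drop the known ones
# with a single grep -v instead of one grep -v per pattern
filter_known_errors () {
   grep ERROR | grep -v -e "$KNOWN_ERRORS"
}

# Read lines from stdin and print every line matching the extended regular
# expression $1, plus $2 lines of context before and after each match
show_context () {
   awk -v pat="$1" -v n="$2" '
      BEGIN { last = 0; after = 0 }
      {
         if ($0 ~ pat) {
            # Print the buffered lines before the match that were not printed yet
            for (i = NR - n; i < NR; i++)
               if (i > last) print buf[i]
            print
            last = NR
            after = n
         } else if (after > 0) {
            # Still inside the context that follows a match
            print
            last = NR
            after--
         }
         # Keep only the last n lines in memory
         buf[NR] = $0
         delete buf[NR - n]
      }'
}
```

With these helpers, a call such as `sudo cat $ORALOGDIR/alert_$ORACLE_SID.log | show_context 'ORA-' 2 | tail -10` would show the tail of the ORA- messages with two lines of context, and `filter_known_errors < $APPDIR/appserver1-app.log | tail -10` would show the last ten errors that are not on the known list. Keeping the known errors in one variable means a new pattern is added in one place instead of in every pipeline.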

2025/06/01 11:59
