SHIFT-WIKI - Sjoerd Hooft's InFormation Technology
This WIKI is my personal documentation blog. Please enjoy it and feel free to reach out through Bluesky if you have a question, remark, improvement or observation. See below for the latest additions, or use the search or tags to browse for content.
Df And Du Show Different Space Usage
Summary: What to do when df and du show different space usage.
Date: Around 2011
Refactor: 21 February 2025: Checked links and formatting.
Determining Number Of VMs Per VMFS Volume
Summary: How to determine how many VMs you can run on your VMware volumes.
Date: Around 2009
Refactor: 21 February 2025: Checked links and formatting.
This article explains the following article in more detail:
Yellow Bricks article: Max amount of VMs per VMFS volume
The reason for this is that both articles are quite technical and can be confusing. I have a lot of experience with storage, probably a little more than the average system administrator, but not as much as the people who have worked with NetApp and EMC for 20 years. So I decided to write an article that goes through the articles step by step, excludes the steps that are not relevant for my environment, and provides an extensive example for future reference.
My environment:
- ESX 4.1 update 1
- VMs are 80% Windows Server 2003 and 20% Windows Server 2008
- Storage is IBM NSeries N6060, which is a rebranded NetApp FAS3160
- Storage link is 4 Gb Fibre Channel
- Disks are Fibre Channel type
Excluded step: I'm not taking SCSI reservations into consideration. See the quote below for the kinds of operations that cause SCSI reservations. In my environment I do not expect these kinds of operations to happen on a daily basis:
VMFS is a clustered file system and uses SCSI reservations as part of its distributed locking algorithms. Administrative operations, such as creating or deleting a virtual disk, extending a VMFS volume, or creating or deleting snapshots, result in metadata updates to the file system using locks, and thus result in SCSI reservations. Reservations are also generated when you expand a virtual disk for a virtual machine with a snapshot. A reservation causes the LUN to be available exclusively to a single ESX host for a brief period of time. Although it is acceptable practice to perform a limited number of administrative tasks during peak hours, it is preferable to postpone major maintenance or configuration tasks to off-peak hours in order to minimize the impact on virtual machine performance.
Remember that changing settings and/or defaults could do more harm than good when you have not properly analyzed your environment. Default values are there for a reason.
To arrive at an actual number of virtual machines per VMFS volume, we have to gather data first. Data gathering consists of two parts:
- Performance and settings data gathering
- Capacity data gathering
Script: PowerCLI: Deploy Storage to vSphere
Summary: A script to configure storage in vSphere.
Date: 29 December 2011
Refactor: 21 February 2025: Checked links and formatting.
I'm actually quite proud of this script. It creates a volume and LUNs, connects the LUNs to the required ESX hosts and creates a datastore on each of them. On top of that it also sets the MPIO settings correctly on each ESX host:
########################################################################################################################
# Author : Sjoerd Hooft
# Date Initial Version: 29 December 2011
# Comments: sjoerd_warmetal_nl
#
# Description:
# This script creates a volume per vSphere cluster, then creates LUNs, then connects the newly created lun to the esx hosts inside the cluster and creates the datastores on them.
# Make sure to modify the variables for your environment
#
# Credits:
# The idea for this script is partly from https://communities.netapp.com/docs/DOC-6181, which roughly does the same for NFS volumes.
#
# Recommendations:
# This script must run from the PowerCLI.
#
# Changes:
# Please comment on your changes to the script (your name and email address, line number, description):
# DATE - USERNAME - EMAILADDRESS - CHANGE DESCRIPTION
########################################################################################################################

# Script Variables
$timestamp = Get-Date -format "yyyyMMdd-HH.mm"

# Getting credentials for vCenter
while ($vcentercredentials -eq $null) {
    Write-Host (get-date -uformat %I:%M:%S) "Please provide authentication credentials for vCenter" -ForegroundColor Green;
    $vcentercredentials = $Host.UI.PromptForCredential("Please enter credentials", "Enter vCenter credentials", "intranet\adminshf", "")
}

# Getting credentials for NetApp filer
while ($filercredentials -eq $null) {
    Write-Host (get-date -uformat %I:%M:%S) "Please provide authentication credentials for NetApp filer" -ForegroundColor Green;
    $filercredentials = $Host.UI.PromptForCredential("Please enter credentials", "Enter NetApp filer credentials", "root", "")
}

# Setting variables vSphere
$vCenter = "vcenter"
$cluster = "BCP Test"

# Setting variables NetApp
$filer = "netappfiler01"
$aggr = "aggr1"
$newvol = "ESX_BCPTEST"
$numberofluns = 2
$lunname = "BCP_"
$lunsize = 4
$size = "g"
$igroup = "ESX_BCP_Hosts"
$startlunid = 201

# Import DATA ONTAP Module
Import-module D:\sjoerd\DataONTAP

# Connect to vCenter and NetApp Filer
Connect-VIServer $vCenter -Credential $vcentercredentials | Out-Null
Connect-NaController $Filer -Credential $filercredentials | Out-Null

# Create a new volume
$tmpvolsize = (($numberofluns * $lunsize) + 1)
$volsize = [string]$tmpvolsize += $size
Write-Host (get-date -uformat %I:%M:%S) "Creating volume $newvol in aggregate $aggr with size $volsize." -ForegroundColor Green;
New-NaVol $newvol $aggr $volsize | Out-Null

# Set options for the new volume
Set-NaVolOption $newvol no_atime_update yes | Out-Null
Set-NaVolOption $newvol fractional_reserve 0 | Out-Null
Set-NaSnapshotreserve $newvol 0 | Out-Null
Set-NaSnapshotSchedule $newvol -Weeks 0 -Days 0 -Hours 0 -WhichHours 0 | Out-Null

# Creating LUNs
$luncounter = 0
$lunid = $startlunid
$fulllunsize = [string]$lunsize += $size
$esxhost = Get-Cluster $cluster | Get-VMHost | sort | Select-Object -First 1
while ($luncounter -le $numberofluns-1) {
    $lunname | foreach{
        Write-Host (get-date -uformat %I:%M:%S) "Creating lun /vol/$newvol/$_$lunid with size $fulllunsize. " -ForegroundColor Green;
        New-NaLun /vol/"$newvol"/"$_$lunid" -size $fulllunsize -Type vmware | Out-Null
        Set-NaLunComment /vol/"$newvol"/"$_$lunid" "LUN for ESX Cluster $cluster" | Out-Null
        Add-NaLunMap /vol/"$newvol"/"$_$lunid" -InitiatorGroup $igroup -ID $lunid | Out-Null
        Get-VMHostStorage $esxhost -RescanAllHba -RescanVmfs | Out-Null
        $path = Get-ScsiLun -vmhost $esxhost -luntype disk | where { $_.runtimename -match $lunid }
        Write-Host (get-date -uformat %I:%M:%S) "Creating datastore $lunname$lunid on path $path on host $esxhost." -ForegroundColor Green;
        New-Datastore -vmhost $esxhost -Name "$lunname$lunid" -path $path -vmfs | Out-Null
    }
    $luncounter++
    $lunid++
}

# Rescan storage and datastores on all hosts in the cluster and set MPIO settings for each host.
$esxhosts = Get-Cluster $cluster | Get-VMHost
$rootpasswordsecure = Read-Host -assecurestring "Please enter the password of the root user to configure the MPIO settings on each host"
Function ConvertToPlainText( [security.securestring]$rootpasswordsecure ) {
    $marshal = [Runtime.InteropServices.Marshal]
    $marshal::PtrToStringAuto( $marshal::SecureStringToBSTR($rootpasswordsecure) )
}
$rootpassword = ConvertToPlainText $rootpasswordsecure
ForEach ($esxhost in $esxhosts) {
    Write-Host (get-date -uformat %I:%M:%S) "Scanning storage on host $esxhost" -ForegroundColor Green;
    Get-VMHostStorage $esxhost -RescanAllHba -RescanVmfs | Out-Null
    Write-Host (get-date -uformat %I:%M:%S) "Setting MPIO on host $esxhost. Logging data to D:\adminshf\mpath_$esxhost" -ForegroundColor Green;
    # Checking for the rsa key to be in cache
    D:\adminshf\plink -ssh $esxhost -l root -pw $rootpassword echo "Connection to $esxhost successful"
    D:\adminshf\plink -ssh $esxhost -l root -pw $rootpassword /opt/ontap/santools/config_mpath --primary --loadbalance --policy fixed > D:\adminshf\mpath_"$esxhost"_"$timestamp"
    Write-Host (get-date -uformat %I:%M:%S) "Done on host $esxhost" -ForegroundColor Green;
}
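The sizing and naming rules in the script are easy to check on their own before touching the filer: the volume is sized as the number of LUNs times the LUN size plus 1 (in units of `$size`, here "g"), and the LUNs get consecutive IDs starting at `$startlunid`. The following is a minimal shell sketch of just that arithmetic, not part of the original script. The function names `vol_size_gb` and `lun_paths` are made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helpers that mirror the sizing and naming logic of the
# PowerCLI script. They only print values; nothing is created anywhere.

# Volume size: all LUNs plus 1 unit of headroom, as in
# $tmpvolsize = (($numberofluns * $lunsize) + 1)
vol_size_gb() {
    numberofluns=$1
    lunsize=$2
    echo $(( numberofluns * lunsize + 1 ))
}

# Print one LUN path per line, using consecutive IDs starting at startlunid,
# in the same /vol/<volume>/<lunname><lunid> layout the script uses.
lun_paths() {
    vol=$1
    prefix=$2
    startlunid=$3
    count=$4
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "/vol/$vol/$prefix$(( startlunid + i ))"
        i=$(( i + 1 ))
    done
}
```

With the variables from the script, `vol_size_gb 2 4` gives the number used for the volume size, and `lun_paths ESX_BCPTEST BCP_ 201 2` lists the two LUN paths that would be created.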
Script: Bash: AIX: Daily Check Script
Summary: A daily script to check all sorts of stuff on AIX.
Date: 27 December 2010
Refactor: 21 February 2025: Checked links and formatting.
#!/bin/bash
########################################################################################################################
# Author : Sjoerd Hooft
# Date Initial Version: 27 Dec 2010
# Comments: sjoerd_@_warmetal_nl
#
# Description:
# This is a sample script to perform the daily checks on AIX servers.
#
# Recommendations:
# The script is designed for a 120 column terminal.
# The running user must be able to do a passwordless sudo to root.
#
# Changes:
# Please comment on your changes to the script (your name and email address, line number, description):
########################################################################################################################

# Script Variables
HOSTNAME_SHORT=`hostname -s`
AUTOMATIC=0
BASEDIR=`dirname $0`
LOGFILE="$BASEDIR/dc.log"
WHATAMI=`basename $0`
DATE=`date +%Y%m%d`
TOMAIL=sjoerd_@_warmetal_nl
BOLD=`tput bold`
BOLDOFF=`tput sgr0`

# Directories
APPDIR="/var/log/APP"
WASDIR="/opt/WAS_Profiles/AppSrv/logs"
FILE3DIR="/var/data/FILE3"
FILE1DIR="/var/data/FILE1/log"
JMSDIR="/var/data/app/jms_errors"
TOMCATDIR="/var/log/app"

# Oracle Variables
ORACLE_HOME="/opt/oracle/product/10.2"
ORACLE_BASE="/opt/oracle"
ORACLE_SID_DB1=db1
ORACLE_SID_DB2=db2
export ORACLE_HOME ORACLE_BASE

# Function to pause the script
# The operator can evaluate the outcome of the previous function
scriptContinue () {
    if [ "$AUTOMATIC" == "0" ]; then
        echo "Press ENTER to continue"
        read CONTINUE
        clear
    fi
}

# Function that will list the AIX internal errors
checkErrors () {
    echo "$BOLD Listing the Error Logging Facility: $BOLDOFF"
    errpt
    echo
}

# Function that will clear all AIX internal errors
clearErrors () {
    echo "Clearing Errors"
    sudo errclear 0
}

# Function that will let the operator view the AIX internal errors in detail
viewErrors () {
    echo "Viewing Errors"
    errpt -a | less
}

# Function that will remove all files from the protected directory that holds JMS/MQ errors
removeJms () {
    echo "Are you sure you want to remove the JMS error files from $JMSDIR? "
    echo "If you hesitate, press CTRL+C to exit the script. "
    scriptContinue
    echo "Removing these files: "
    echo $JMSDIR/*
    sudo rm $JMSDIR/*
    echo
    echo "Done"
    echo
}

# Function that will check the last 4 logfiles from 4 different applications
# This is possible with multiple for loops since the files are named similar
# Known errors are being skipped
# It will show only the last 10 entries per logfile
checkLog-abs () {
    echo "$BOLD Checking abs-logs in $APPDIR $BOLDOFF "
    echo "Note: we check the last 4 logfiles and skip any known error, and limit the amount of lines to 10."
    for application in appserver1 appserver2 appserver3; do
        for logfile in app.log.4 app.log.3 app.log.2 app.log; do
            echo "Checking $BOLD $application-$logfile $BOLDOFF "
            cat $APPDIR/$application-$logfile | grep ERROR | \
            grep -v 'LDAP: error code 32 - No Such Object' | \
            grep -v 'doRefreshProposalsResponse didn.d send the email caught - ignoring' | \
            grep -v 'Error getting active tan: No TAN available for user' | \
            grep -v 'CORBA OBJECT_NOT_EXIST' | \
            tail -10
            echo
        done
        scriptContinue
        clear
    done
    echo
}

# Function to check the SystemOut.log from the websphere applications
# Known errors are being skipped
# It will show only the last 10 entries per logfile
checkLog-was () {
    echo "$BOLD Checking websphere logs in $WASDIR $BOLDOFF "
    for server in server1 server2 server3 server4; do
        echo "Checking $BOLD ${server}_Server/SystemOut.log $BOLDOFF "
        cat $WASDIR/${server}_Server/SystemOut.log | grep -i error | \
        grep -v 'oracle.jdbc.driver.DatabaseError.throwSqlException' | \
        grep -v 'The Network Adapter could not establish the connectionDSRA0010E' | \
        grep -v 'Error creating XA Connection and Resource com.ibm.ws.exception.WsException: DSRA8100E' | \
        grep -v 'Error creating XA Connection and Resource java.security.PrivilegedActionException:' | \
        tail -10
        echo
        scriptContinue
    done
    echo
}

# Function to check whether files have been processed.
# They will have a different extension.
checkFiles-host3 () {
    echo "$BOLD Checking the process on $HOSTNAME_SHORT $BOLDOFF "
    echo "There should be no files ending on .txt older than one hour:"
    echo "Last 10 files ending on .txt in $FILE3DIR:"
    ls -ltr $FILE3DIR | grep '\.txt$' | tail -10
    echo
    echo "$BOLD Checking the process on $HOSTNAME_SHORT $BOLDOFF "
    echo "There should be recent (last 24 hours) files:"
    echo "Last 10 files in $FILE3DIR:"
    ls -ltr $FILE3DIR | grep '\.txt' | tail -10
    echo
    scriptContinue
}

# Function that will check whether error files exist
# It will allow the operator, after examining the size, to delete them
# Continue works only in this menu structure because this is the last check for this host
checkFiles-host1 () {
    echo "$BOLD Checking MQ process error files on $HOSTNAME_SHORT $BOLDOFF "
    echo "Checking for jms (MQ) errors in $JMSDIR, there should be no files in this directory:"
    ls -ltr $JMSDIR
    if [ $AUTOMATIC == 0 ]; then
        JMSACTION=`ls -ltr $JMSDIR | wc -l`
        if [ $JMSACTION -gt 1 ]; then
            echo
            echo "${BOLD}There are files in this directory!$BOLDOFF If all files are really small ( < 100 bytes ) you can delete them. "
            echo " Would you like to do that right now?"
            echo
            echo "remove - remove all files in $JMSDIR"
            echo "continue - continue with dailycheck"
            echo
            menuChoice
        fi
        scriptContinue
    else
        echo "AUTOMATIC mode is on. If there are any files run the script manually on $HOSTNAME_SHORT "
    fi
}

# Function to check Oracle logfile bdump for errors
# It will show the line with the error, as well as the 2 lines before and after
# It will show only the last 10 entries per logfile
checkLog-ora () {
    ORALOGDIR="/var/log/oracle/10.2/${ORACLE_SID}/bdump"
    echo "$BOLD Checking the Oracle logfile $ORALOGDIR/alert_$ORACLE_SID.log $BOLDOFF "
    echo "The last 10 ORA- messages are displayed, including the 2 lines before and the two lines after "
    sudo cat $ORALOGDIR/alert_$ORACLE_SID.log | sed -e '
    1{$!N;$d;}
    $!N;/ORA-/!D
    $!N;$d;N;p
    g;$!N;$d;N;D
    '| tail -10
    echo
    scriptContinue
}

# Function to check tomcat application servers for errors
# It will evaluate all logfiles created the last four days
# Known errors are being skipped
checkLog-tomcat () {
    echo "$BOLD Checking the tomcat application server logs on $HOSTNAME_SHORT $BOLDOFF "
    echo "$BOLD Checking Tomcat logfiles: $BOLDOFF"
    echo "Checking the last four days of $TOMCATDIR/application.log files"
    find $TOMCATDIR/app/. -type f -name 'application*' -mtime -3 -print -exec cat {} \; | grep ERROR
    echo
    echo "Checking the last four days of $TOMCATDIR/framework.log files"
    find $TOMCATDIR/app/. -type f -name 'framework*' -mtime -3 -print -exec cat {} \; | grep ERROR
    echo
    scriptContinue
}

# Function to expand the options handling AIX system errors
actionErrors () {
    menuStart
    checkErrors
    echo "Note: The system clears all hardware errors automatically after 90 days, and all other errors after 30 days."
    echo
    echo "clearerrors - clear all errors now"
    echo "viewerrors - review errors in less"
}

# Function to specify which host the script runs on
# Declare host specific variables
# Set the actions to be taken
hostSpecific () {
    clear
    if [ "$HOSTNAME_SHORT" == "host1" ]; then
        checkLog-abs
        checkLog-was
        checkFiles-host1
    fi
    if [ "$HOSTNAME_SHORT" == "host2" ]; then
        export ORACLE_SID=$ORACLE_SID_DB2
        checkLog-ora
    fi
    if [ "$HOSTNAME_SHORT" == "host3" ]; then
        export ORACLE_SID=$ORACLE_SID_DB1
        checkLog-ora
        checkLog-tomcat
        checkFiles-host3
    fi
}

# Function to clear the screen and give the idea of a pretty script
menuStart () {
    clear
    echo "########################################################################################################################"
    echo "################################################### Daily Check Menu ###################################################"
    echo
}

# Function to show the operator the default menu options
menuEnd () {
    echo
    echo "errors - take further actions regarding errors"
    echo "host - start host specific checks"
    echo "auto - restarts the script and runs it automatically, after which the logfile is mailed to $TOMAIL "
    echo " - this also works from the commandline: $WHATAMI auto "
    echo
    echo "exit - exit"
    echo
}

# Function to read the menu option from the operator
# This menu is used for all required menus in the script
menuChoice () {
    echo "Enter menu choice: [exit]"
    read MENUCHOICE
    if [ -z "$MENUCHOICE" ]; then
        MENUCHOICE="exit"
    fi
    case $MENUCHOICE in
        errors )
            actionErrors
            menuChoice
            ;;
        host )
            hostSpecific
            ;;
        clearerrors )
            clearErrors
            ;;
        viewerrors )
            viewErrors
            ;;
        auto )
            $BASEDIR/$WHATAMI auto
            exit
            ;;
        exit )
            exit
            ;;
        remove )
            removeJms
            ;;
        continue )
            echo
            ;;
        * )
            echo "Wrong Input"
            menuChoice
            ;;
    esac
}

# Function to mail the log when the script has run automatically
mailLog () {
    cat $LOGFILE | mail -s "Report $WHATAMI on $HOSTNAME_SHORT of $DATE" $TOMAIL
}

# Function to determine whether the script should run automatically
# Set the automatic variables to send the output to a logfile instead of a screen
# and make the logfile readable by removing bold text markers
# It also makes sure the logfile gets mailed
if [ "$1" == "auto" ]; then
    AUTOMATIC=1
    BOLD=
    BOLDOFF=
    exec > $LOGFILE 2>&1
    checkErrors
    hostSpecific
    mailLog
    exit
fi

# Actual script:
# Infinite while loop, as long the script is not exited,
# start the menu, check for errors and ask the operator what to do
while :
do
    menuStart
    checkErrors
    menuEnd
    menuChoice
done
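The log checks in the script all follow the same pattern: select the error lines, drop the known errors, and show only the last 10 that remain. As a minimal sketch of that pattern, assuming the known errors are kept in a separate file with one pattern per line instead of a chain of `grep -v` calls, it could look like this. The function name `filter_errors` and the pattern file are hypothetical and not part of the script above.

```shell
#!/bin/sh
# filter_errors LOGFILE KNOWNFILE
# Print the last 10 lines of LOGFILE that contain ERROR and match none of
# the patterns listed (one per line) in KNOWNFILE.
# Assumption: KNOWNFILE holds at least one pattern; the patterns are basic
# regular expressions, just like the arguments to grep -v in the script.
filter_errors() {
    logfile=$1
    knownfile=$2
    grep ERROR "$logfile" | grep -v -f "$knownfile" | tail -n 10
}
```

A call such as `filter_errors /var/log/APP/appserver1-app.log known_errors.txt` would then replace one of the pipelines in `checkLog-abs`, and adding a new known error becomes a matter of adding a line to the pattern file.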
