Monday, November 11, 2013

Optimizing NetApp Block IO for optimal Performance (SQL)

Thanks to a blog by Nathan Storms (the architect evangelist):


I came across his blog and found it very useful when setting up our virtual SQL servers. This post focuses on optimizing performance within NetApp Data ONTAP itself, not on how you connect to the storage.


Thin provisioning: the common belief is that thin provisioning, whether on the volume or the LUN, leads to performance issues due to fragmentation over time, and that it is therefore best practice to fully reserve your volumes, LUNs and snapshot reserve. This belief is false. WAFL (Write Anywhere File Layout) will fragment sequential blocks over time regardless of reservations; environments with large files, such as virtual disks or large database files, are the most prone to this performance impact. It is by design: NetApp writes anywhere on the aggregate in an effort to reduce the latency of writing data to disk. The problem is that over time the performance gains of sequential reads are lost as data becomes fragmented and read-ahead benefits disappear. Given how data is written, there is no reason to fully reserve your volumes and LUNs. Next, let's look at how to easily fix this and other performance issues.
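For reference, thin provisioning is configured in two places: the volume is created with no space guarantee, and the LUN is created with no space reservation. A minimal sketch (the volume, aggregate and LUN names here are hypothetical; a full worked example follows below):
vol create volThin -s none aggr0 500g
lun create -s 100g -t hyper_v -o noreserve /vol/volThin/thin.lun
The -s none option removes the volume's space guarantee against the aggregate, and -o noreserve removes the LUN's space reservation within the volume.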
First we need to configure a NetApp option for de-duplication. By default, when you enable sis on a volume the schedule is set to automatic, and de-duplication starts once a certain percentage of changed data accumulates on the volume. This is fine until you have a number of volumes and several of them decide to de-dupe at the same time. You could put each volume on its own schedule, but then you end up de-duping volumes that don't need it; alternatively, you can limit the number of volumes that are allowed to de-dupe concurrently. Use the following command to set the limit:
NetApp>options sis.max_active_ops 1
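If you would rather stagger de-duplication with per-volume schedules, sis config can do that, and sis status shows which volumes are currently de-duping. A hedged example (the volume name is a placeholder, and the day_list@hour schedule format should be verified against your Data ONTAP release):
sis config -s sun-sat@23 /vol/volSQL1
sis status
The first command runs de-duplication nightly at 23:00 on that volume; sis status lists every SIS-enabled volume and shows whether an operation is idle or active.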
Example: thin provisioning block storage to a Hyper-V host for a virtualized SQL server. The VM will need four virtual disks:
  • 60GB: Operating System Disk
  • 400GB: SQL Data Disk
  • 200GB: SQL TLog Disk
  • 100GB: SQL TempDB Disk
The first step is to perform a sizing calculation. We are going to allocate a lot of white space on disk, but it won't be lost because it is all thin provisioned.

144GB = 60GB Virtual Disk + 60GB for Hyper-V Config and Hyper-V Snapshots + 24GB (20% Reserve)
480GB = 400GB Virtual Disk + 80GB (20% Reserve)
240GB = 200GB Virtual Disk + 40GB (20% Reserve)
120GB = 100GB Virtual Disk + 20GB (20% Reserve)

Adding it up, we will provision 984GB in LUN allocations; if we add the NetApp 20% snapshot reserve (roughly 197GB), we end up creating a thin provisioned volume that is roughly 1181GB in size.
Note: If we were building two SQL servers for database mirroring, we would put the LUNs for both virtual SQL servers in the same volume so that de-duplication could reclaim additional storage.
To create the volume and LUNs and map them to a Hyper-V host initiator group called HV1, we would do the following:
vol create volSQL1 -l en_US -s none aggr0 1181GB
vol options volSQL1 no_atime_update on
vol options volSQL1 snapshot_clone_dependency on
vol options volSQL1 read_realloc space_optimized
snap autodelete volSQL1 on
snap sched volSQL1 0 0 0
sis on /vol/volSQL1
sis config -s auto /vol/volSQL1
reallocate measure /vol/volSQL1
qtree create /vol/volSQL1/qtree
lun create -s 144g -t hyper_v -o noreserve /vol/volSQL1/qtree/SQL1_OS.lun
lun create -s 480g -t hyper_v -o noreserve /vol/volSQL1/qtree/SQL1_DATA.lun
lun create -s 240g -t hyper_v -o noreserve /vol/volSQL1/qtree/SQL1_TLOG.lun
lun create -s 120g -t hyper_v -o noreserve /vol/volSQL1/qtree/SQL1_TEMPDB.lun
lun map /vol/volSQL1/qtree/SQL1_OS.lun HV1
lun map /vol/volSQL1/qtree/SQL1_DATA.lun HV1
lun map /vol/volSQL1/qtree/SQL1_TLOG.lun HV1
lun map /vol/volSQL1/qtree/SQL1_TEMPDB.lun HV1
The script above does the following tasks:
  • Creates a new thin provisioned volume in aggr0
  • Sets the volume to not record last-access timestamps on the LUNs within the volume
  • Sets the volume to allow deleting snapshots even if a busy snapshot is present
  • Sets the volume to perform read reallocation (read below for more detail)
  • Sets the volume to automatically delete snapshots if it is running out of space
  • Disables the volume's scheduled snapshots, as snapshots should be taken with NetApp SnapManager
  • Enables de-duplication on the volume
  • Sets de-duplication on the volume to automatic
  • Enables reallocation measurement on the volume
  • Creates a qtree within the volume
  • Creates multiple thin provisioned LUNs with a host type of Hyper-V
  • Maps the LUNs to an initiator group named HV1
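After running the script it is worth verifying the result with a few read-only commands (a sketch, assuming the same volume and igroup names as above):
vol status -v volSQL1
sis status /vol/volSQL1
lun show -m
df -S volSQL1
Here vol status -v confirms the volume settings, sis status confirms de-duplication is enabled, lun show -m lists the LUN-to-igroup mappings, and df -S reports the space saved by de-duplication.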
To resolve the fragmentation created over time by WAFL we use reallocation (defragmentation). There are two ways to reallocate:
  1. Manual reallocation: if you have never performed a reallocation on an existing volume, you may want to manually run a single pass of a reallocation task. Note that to run a manual reallocation you must remove the snapshots from the volume. This may take some time to complete and is an expensive operation in terms of system resources.
  2. Read reallocation: as blocks are read, the NetApp automatically examines them and optimizes the data layout by moving the blocks to a better location on disk (small, incremental defragmentation).
This is interesting: data is written to WAFL in a fragmented manner for write performance, yet when your application reads that data back, the layout is automatically optimized for read performance. This has another advantage: if your IO pattern changes, the layout automatically adjusts to the new pattern. For example, if a SQL table were re-indexed or reorganized on disk, or the access pattern were changed by a modification to a stored procedure, the NetApp's read reallocation would automatically detect the change in the IO pattern and optimize the layout on disk.
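To confirm that read reallocation is active on an existing volume, you can simply list its options (using the volume from the example above):
vol options volSQL1
The output should include read_realloc=space_optimized; the space_optimized variant is used in the script above because it avoids consuming extra snapshot space when blocks are moved.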

Reallocation measurement was turned on so that you can check the state of fragmentation on your volumes with 'reallocate status'.

Using this design you obtain the cost benefits of thin provisioning and de-duplication while automatically maintaining optimal performance over time.

How to optimize disk read performance in NetApp Data ONTAP

Issue:
Read performance on a LUN presented by a NetApp controller becomes increasingly sluggish over time. You may notice high latencies (20ms or higher), or latency that has steadily increased over time, as in the examples shown below.

Example graph showing latency trending over time

Example output from the stats show command placed in a table for ease of display.

Instance                             avg_latency (ms)
/vol/volume_name/lun1-Xn/8W3OdfJbw   28.12
/vol/volume_name/lun1-Xn/8W3OdfJbw   9.36
/vol/volume_name/lun1-Xn/8W3OdfJbw   11.38
/vol/volume_name/lun1-Xn/8W3OdfJbw   28.27
/vol/volume_name/lun1-Xn/8W3OdfJbw   22.88
/vol/volume_name/lun1-Xn/8W3OdfJbw   11.75
/vol/volume_name/lun1-Xn/8W3OdfJbw   14.6
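
The per-LUN latency figures above can be gathered directly on the controller with the stats command; a hedged example (the lun object and avg_latency counter are the standard names, but confirm them on your Data ONTAP release):
stats show lun:*:avg_latency
This prints the average latency counter for every LUN instance; replacing the * with a specific LUN path narrows the output to a single LUN.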

Causes:
Just like other file systems, the WAFL file system used by Data ONTAP can be a victim of fragmentation in very large files and LUNs. Most likely your LUN (which is effectively a single large file) has become fragmented, and you should defragment it by running the reallocate command against the LUN or volume.

Solution:
Run the following command against the LUN to check for fragmentation.
reallocate measure -o /vol/volume_name/lun

Check the status by using this command:
reallocate status

Example output
/vol/volume_name/lun
        State: Checking: snap 0, public inode 101, block 943500 of 19661543
     Schedule: n/a
     Interval: 1 day
 Optimization: 5 [hot-spots: 18]
  Measure Log: n/a

The optimization scale runs from 1 (optimized) to 10 (very unoptimized) and is how the WAFL file system quantifies fragmentation. The generally accepted threshold for acceptable optimization is 3 or 4 with no hot spots.

To defragment and optimize the LUN, run the following command:
reallocate start -f -p /vol/volume_name/lun

The -f (force) option performs a one-time full reallocation of a file, LUN or an entire volume.
The -p option reduces the extra storage requirements in a flexible volume when reallocation is run on a volume with snapshots.
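
If fragmentation keeps creeping back, reallocation can also be run on a recurring basis rather than by hand; a sketch (the schedule string format, minute hour day-of-month day-of-week, should be verified against your Data ONTAP release):
reallocate schedule -s "0 2 * 6" /vol/volume_name/lun
reallocate status -v
The first command is intended to schedule a weekly reallocation scan at 02:00 (the last field is the day of the week), and reallocate status -v shows scan progress and the most recent optimization measurement.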