
Isilon OneFS Filesystem and UCS create a high-throughput architecture

Isilon hardware running the OneFS filesystem, connected to Cisco UCS, makes for an extremely well-tailored architecture for throughput. Isilon systems have 2x 10G ports per chassis, so from each Isilon chassis one 10G port goes to each UCS Fabric Interconnect, either via appliance ports or an intermediate switch. SmartConnect on the Isilon can be tuned to balance client traffic in several ways, so all of the 10G ports get utilized!
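For reference, SmartConnect Advanced can balance client connections by round robin, connection count, network throughput, or CPU usage. On newer OneFS releases the policy is set with something like the commands below; the pool name is a placeholder and the exact syntax varies between OneFS versions, so treat this as an assumption rather than a recipe:

# Set the SmartConnect balancing policy on a pool (placeholder pool name;
# older OneFS releases use the "isi networks" command syntax instead).
isi network pools modify groupnet0.subnet0.pool0 --sc-connect-policy round_robin

# Review what the pool is currently set to.
isi network pools view groupnet0.subnet0.pool0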

OK, but what NAS system can actually handle all that 10G architecture? Isilon!

The “isi statistics drive” command only shows one chassis at a time, but it shows utilization per drive, and it can easily be scripted to report drives across all chassis (see the sketch below). So performance can be measured at a very granular level. Note how uniform the utilization is: no hot spots like traditional RAID groups. That is partly because Isilon does not use traditional RAID. Vendors like EMC, XIO, and HP have been moving in this direction for a while, but Isilon has the idea down better than any other vendor (in my opinion). More on Isilon’s OneFS after the cut-and-pasted data below. If you remove the front covers from the Isilon and hit the lights while it is under load, you can actually see the OneFS striping happening in “waves”. It is actually very cool to watch.
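Here is a rough sketch of scripting it across the whole cluster with OneFS’s isi_for_array helper (flags can differ slightly between OneFS releases, so adjust as needed):

# Run "isi statistics drive" on every node; -s runs the nodes one at a time
# so the output stays grouped per chassis.
isi_for_array -s 'isi statistics drive'

# Loop it to watch for hot drives while a workload is running.
while true; do
  clear
  isi_for_array -s 'isi statistics drive'
  sleep 5
done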

This 5-chassis Isilon was averaging 60 clients totaling 8 Gbps against the /ifs volume (a single shared filesystem space) with a mix of read and write traffic. Check out the uniform traffic across the disks in one chassis below. This setup had 96GB of RAM in each chassis, 2x 10GbE in use on each chassis, and 4 SSDs in each chassis dedicated to metadata.

Isilon-5# isi statistics drive

Drive Type OpsIn BytesIn OpsOut BytesOut TimeAvg TimeInQ Queued Busy
LNN:bay        N/s     B/s    N/s      B/s      ms      ms           %
5:1  SSD   9.8     64K   12.4     102K     0.1     0.0    0.0  0.9
5:2  SSD  13.0     83K   11.4      93K     0.1     0.0    0.0  0.4
5:3 SATA  23.8    1.3M  128.0     7.4M     0.1     4.6    0.6 34.7
5:4 SATA  24.4    1.4M  135.2     7.7M     0.1     4.3    0.6 38.7
5:5 SATA  20.4    1.2M  129.2     7.4M     0.1     4.3    0.6 36.3
5:6 SATA  21.0    1.2M  139.8     7.8M     0.1     4.1    0.5 38.7
5:7 SATA  25.8    1.5M  143.2     7.8M     0.1     4.2    0.6 40.5
5:8 SATA  15.4    815K  111.2     5.9M     0.1     4.2    0.5 26.9
5:9 SATA  20.6    1.2M  105.6     6.1M     0.1     4.2    0.6 31.3
5:10 SATA  17.8    1.0M  119.6     7.0M     0.1     4.3    0.6 34.5
5:11 SATA  20.6    1.2M  129.0     7.2M     0.1     4.2    0.6 37.1
5:12 SATA  23.4    1.3M  126.2     7.2M     0.1     4.3    0.6 36.3
5:13 SATA  22.4    1.3M  133.8     7.5M     0.1     4.5    0.6 34.7
5:14 SATA  26.8    1.5M  121.8     6.9M     0.1     4.2    0.6 32.9
5:15 SATA  16.8    916K  126.6     6.4M     0.1     4.3    0.6 31.1
5:16 SATA  24.4    1.4M  143.2     7.3M     0.1     4.3    0.6 36.3
5:17 SATA  18.8    965K  125.4     6.8M     0.1     4.4    0.6 31.9
5:18 SATA  26.4    1.5M  125.2     7.4M     0.1     4.9    0.6 35.7
5:19 SATA  19.6    992K  113.8     6.6M     0.1     4.3    0.6 33.7
5:20 SATA  18.4    998K  124.8     6.9M     0.1     4.3    0.6 34.1
5:21 SATA  20.0    1.1M  128.2     7.0M     0.1     4.0    0.6 33.7
5:22 SATA  20.2    1.2M  113.0     6.5M     0.1     4.3    0.6 32.1
5:23 SATA  26.4    1.5M  130.8     7.4M     0.1     4.6    0.6 37.1
5:24 SATA  22.4    1.2M  127.6     6.9M     0.1     4.2    0.6 32.5
5:25 SATA  21.4    1.3M  123.6     6.7M     0.1     4.2    0.6 32.5
5:26 SATA  22.2    1.3M  133.0     7.3M     0.1     4.1    0.5 37.7
5:27 SATA  18.2    1.0M  145.4     8.2M     0.1     4.4    0.6 36.9
5:28 SATA  17.2    926K  116.2     7.0M     0.1     4.3    0.6 32.5
5:29 SATA  22.2    1.2M  123.8     7.3M     0.1     4.6    0.6 33.3
5:30 SATA  20.8    1.1M  114.8     6.6M     0.1     4.1    0.5 29.1
5:31 SATA  16.6    882K  120.2     6.8M     0.1     4.2    0.5 32.5
5:32 SATA  19.2    1.1M  130.2     7.8M     0.1     4.3    0.6 37.5
5:33 SATA  16.6    906K  127.4     7.3M     0.1     4.3    0.5 32.1
5:34 SATA  17.2    928K  134.2     7.3M     0.1     4.2    0.6 37.1
5:35 SATA  24.0    1.4M  140.6     8.1M     0.1     4.5    0.6 44.7
5:36 SATA  25.6    1.5M  128.2     7.2M     0.1     4.3    0.6 37.9

Some IOzone numbers to compare the 5-chassis X400 system with other storage.

These were all generated using a blade on L-UCS running Red Hat 5.5.

This is the time it took each system to run “iozone -a”, which is a common test for disk performance. It is a series of diverse tests, and the time it takes to run is a great baseline. You can see more on what the tests are at http://linux.die.net/man/1/iozone. I like this test because it throws a good mix of I/O at the disk that will be a pro for some systems and a con for others, so any architecture gets a fair chance to post a good time. It is a fair baseline on any system to compare with others.
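Each result below is just the output of time wrapped around the run from the UCS blade, roughly like this (the server name, export, and mount point here are placeholders, not the ones from my runs):

# Mount the Isilon NFS export and time a full automatic IOzone run.
mkdir -p /mnt/isilon
mount -t nfs isilon-sc:/ifs/data /mnt/isilon
cd /mnt/isilon
time iozone -a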

GPFS system, copper-attached clients.
8 GPFS servers backending to a SAN, on the same subnet as the 10G UCS blade.
real 6m56.689s
user 0m2.757s
sys 0m43.146s

XIO ISE2 SSD SAN
Boot LUN (Damn!!)
real 2m58.728s
user 0m2.992s
sys 1m23.663s

Isilon
5-chassis X400 with all 10G interfaces in use, on the same subnet as the 10G UCS blade.
Tested both NFS and SMB mounts (the mount commands are sketched below these results).
NFS (Nice!)
real 4m21.685s
user 0m2.960s
sys 0m51.298s

SMB (as you can see, an SMB mount from Linux is just crap)
real 19m33.431s
user 0m2.704s
sys 1m48.486s
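For reference, the NFS and SMB mounts from the Red Hat blade were set up roughly like this (server name, share, mount points, and credentials are placeholders):

# NFS mount of the /ifs export.
mount -t nfs isilon-sc:/ifs/data /mnt/isilon-nfs

# SMB mount of the same data through the Linux cifs client.
mount -t cifs //isilon-sc/data /mnt/isilon-smb -o username=testuser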

Some very good YouTube videos on Isilon architecture.

How Isilon is different from traditional storage architecture
http://youtu.be/Ceq46ieR5RE

VMWorld 2009: “Isilon Systems, Scale-Out Storage for Large Scale Server Virtualization”
http://youtu.be/fQDBRqMjhVM

The section below about OneFS, from an “In-depth look at Isilon” article found at http://arstechnica.com/business/raising-your-tech-iq/2011/05/isilon-overview.ars, explains the filesystem well.

“The OneFS file system takes files, breaks them up into 8K blocks, and then stripes those blocks in 16-block units across the disks of as many nodes in the cluster as it’s allowed to use, depending on the configured protection level. The striping goes “wide” (that is, across nodes) before it goes “deep” (across multiple disks in one node)—that is, OneFS tries to get data written to one disk in every node possible first, before then wrapping back and putting data on a second disk in a node. If the file is big enough that it stretches across one disk in every node, the striping will wrap around to the second disk in the first node and start again.

Isilon OneFS Striping

A big file being striped across a three-node Isilon cluster. It gets broken up into 128KB pieces, then written first to node one’s first disk, then node two’s first disk, then node three’s first disk. This repeats until it runs out of disks, and then it wraps around again. I didn’t draw the subsequent striping because then we’d be here all night.

Files don’t necessarily fit perfectly into 128KB pieces, though, and so when the Isilon system gets to the last piece of the file it’s writing and finds it has less than 128KB left, it pads the chunk out to 128KB and commits that. Conversely, some files are a lot less than 128KB, and so they are by default mirrored instead of striped.

Tied into this picture of striping is the underlying reason why Isilon works so well for big files, like giant raw video files of blue cat people for Jim Cameron or thousands and thousands of multi-gigabyte iTunes movies: large enough files get striped across lots of disks inside lots of nodes, and access to those files is serviced by what could be a very large amount of distributed CPU and global cache”
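To make the “wide before deep” idea concrete, here is a tiny toy calculation of my own (not OneFS code, and it ignores protection/parity units) that maps the 128KB stripe units of a 1MB file onto the three-node cluster from the article:

#!/bin/sh
# Toy model only: place 128 KiB stripe units "wide before deep" across a 3-node cluster.
FILE_SIZE=$((1024 * 1024))          # example file: 1 MiB
UNIT=$((128 * 1024))                # stripe unit = 16 x 8K blocks = 128 KiB
NODES=3                             # nodes in the example cluster
UNITS=$(( (FILE_SIZE + UNIT - 1) / UNIT ))
i=0
while [ "$i" -lt "$UNITS" ]; do
  node=$(( i % NODES + 1 ))         # wide first: a different node for each unit
  disk=$(( i / NODES + 1 ))         # then deep: wrap back to the next disk in each node
  echo "stripe unit $i -> node $node, disk $disk"
  i=$(( i + 1 ))
done

Running it prints eight stripe units: one on the first disk of each of the three nodes, then one on the second disk of each node, then the last two starting on the third disks, which is exactly the wrap-around the article describes.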


Craig
