ArcGIS NVMe disk performance?

05-08-2017 05:17 PM
ChrisSnyder
Regular Contributor III

Anyone running a machine with these newfangled NVMe disks?

We purchased a newer workstation with some of these, and while the "industry benchmark" tests we have run (CrystalMark) indicate the disks are indeed super fast and operating as expected, the real-world tests we have run in ArcGIS and Python have not shown a significant performance increase over old-school SATA platter HDDs.

Anyone know why this might be the case? My only assumption is that disk speed is not the bottleneck... If it helps, here's an excerpt of some of the tests we are running. Note that not all of these involve disk I/O, and there are some others (not shown) that are just simple unions, dissolves, and the like:

#Excerpt; assumes the script has already run: import arcpy, os, random, time
#and defined fgdbPath, conRst, benchmarkFolderPath, benchmarkDict, and logMessage
#Process: Export raster to points (about 2 million points)
rasterPntsFC = os.path.join(fgdbPath, "raster_pnts")
time1 = time.clock()
arcpy.RasterToPoint_conversion(conRst, rasterPntsFC, "VALUE")
time2 = time.clock()
benchmarkDict["RASTER_TO_POINTS"] = time2 - time1
logMessage("RASTER_TO_POINTS = " + str(time2 - time1))

#Process: Build a large dictionary independent of disk
time1 = time.clock()
randomDict = {}
sum = 0  #note: this shadows Python's built-in sum() for the rest of the script
i = 0
for x in range(1,2001):
    for y in range(1,2001):
        i = i + 1
        randomDict[x,y] = [random.randint(1,1000)]
        sum = sum + randomDict[x,y][0]
        randomDict[x,y].append(sum / float(i))
time2 = time.clock()

del randomDict
benchmarkDict["BIG_DICTIONARY"] = time2 - time1

#Make another dictionary, but one sourced from disk
time1 = time.clock()
pntDict = {r[0]:r[1] for r in arcpy.da.SearchCursor(rasterPntsFC, ["OID@", "grid_code"])}
time2 = time.clock()
benchmarkDict["POINTS_TO_DICT"] = time2 - time1

#Sort something pretty big
time1 = time.clock()
sortList = sorted(pntDict.items(), reverse=True)  #first pass: descending
sortList.sort()  #second pass: re-sort ascending
time2 = time.clock()
benchmarkDict["SORT_DICT_ITEMS"] = time2 - time1

#Write a big txt file
time1 = time.clock()
testTxtFile = os.path.join(benchmarkFolderPath, "test_text_file.txt")
f = open(testTxtFile, 'a')  #'a' appends, so the file keeps growing on repeat runs
for i in range(10000000):
    f.write(str(i))
f.close()
time2 = time.clock()
benchmarkDict["WRITE_BIG_TXT_FILE"] = time2 - time1

9 Replies
DanPatterson_Retired
MVP Emeritus

I have them on my desktop (custom built) and on this Microsoft Book.

The drives are faster in the testing I did a year or so ago on my desktop, since I have two of them and an old-school drive.  I put my operating system and programs on one, and I use the other for data, with the conventional drive for backup.

If you are using Python 3.x, then make sure you are using time.perf_counter instead of time.clock, which has been deprecated.
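Something along these lines keeps the timing calls portable between interpreters (just a sketch; the timer alias is my own naming):

import sys
import time

# time.clock is deprecated in Python 3 (and later removed), so pick the
# appropriate high-resolution timer once and use it everywhere in the benchmark
if sys.version_info[0] >= 3:
    timer = time.perf_counter
else:
    timer = time.clock

time1 = timer()
# ... geoprocessing step being benchmarked ...
time2 = timer()
print("elapsed seconds: " + str(time2 - time1))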

One complicating factor is memory.  I was running some tests calculating point distances (50 million distances) and writing a numpy array to disk, and it all took under a second.  I will try to find the tests if I can.  But I would look at your combination of memory, drive type, and 32/64-bit processing issues.
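For the 32/64-bit part, a quick check run inside the same Python that ArcGIS uses will confirm what is actually doing the work (a small standard-library sketch):

import platform
import struct
import sys

# pointer size is a reliable bitness check: 4 bytes = 32-bit, 8 bytes = 64-bit
print("Interpreter: " + sys.executable)
print("Python     : " + sys.version)
print("Bitness    : " + str(struct.calcsize("P") * 8) + "-bit")
print("Machine    : " + platform.machine())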

If you have a specific short reproducible test, I can run them on both machines for comparison if you like.

ChrisSnyder
Regular Contributor III

Thanks for the input Dan. So yes, I do indeed have some data and a script! Should take 3-4 minutes to run. The script will need a bit of path updating BTW. Maybe just uncomment that whole thing at the end which writes a dbf file to a network location.

ftp://ww4.dnr.wa.gov/frc/for_dan/

If you could post your results (the txt log file would be fine) I'd love to see the numbers from a different NVMe machine. One thought I had is that it could be our corporate virus scan software, which our IT Dept. loves to crank up to the "max slowness" setting. 

DanPatterson_Retired
MVP Emeritus

It doesn't like me... just spinning and won't connect.

What is the file size on disk?  I may be able to emulate a write and read for you to test.

EDIT

If you have scipy and numpy, this test produces a ~400 MB file with reproducible values, shape, and size.

Test results are for the Microsoft Book

import numpy as np
from scipy.spatial.distance import cdist

N = 50000000  # 50 million destinations, 1 origin
a = np.random.mtrand.RandomState(1).randint(0, 10, size=(N,2))
b = np.random.mtrand.RandomState(2).randint(0, 10, size=(1,2))

d = cdist(a, b)  # for saving

%timeit cdist(a, b)
1 loop, best of 3: 697 ms per loop  # scipy distance  calculation

%timeit np.save("c:/temp/d.npy", d)
1 loop, best of 3: 1.29 s per loop

%timeit np.load("c:/temp/d.npy")
1 loop, best of 3: 219 ms per loop

# file size 381 MB (400,000,080 bytes)

The run times for the calculation, the save, and the read can be used for comparison.  I will test on my desktop when I get a chance.

DanPatterson_Retired
MVP Emeritus

Still a no-go on the link Chris

ChrisSnyder
Regular Contributor III

Hi Dan,

Your code didn't work for me right out of the gate (the %timeit magic only runs in IPython), so I rewrote it like this:

import numpy as np
from scipy.spatial.distance import cdist
import timeit

def test1():
    return cdist(a, b)  #pure calculation, no disk involved

def test2():
    np.save(driveLetter + ":/temp/d.npy", d)  #write test (assumes a \temp folder exists on each drive)

def test3():
    np.load(driveLetter + ":/temp/d.npy")  #read test

n = 50000000  # 50 million destinations, 1 origin
a = np.random.mtrand.RandomState(1).randint(0, 10, size=(n, 2))
b = np.random.mtrand.RandomState(2).randint(0, 10, size=(1, 2))
d = cdist(a, b)  #the ~400 MB array that gets saved/loaded

for driveLetter in ["c", "d", "e"]:
    print("On the " + driveLetter + ":\ drive...")
    print("-Test #1 (proc speed)")
    for i in range(3):
        print("--" + str(timeit.timeit(test1, number=1)))
    print("-Test #2 (write speed)")
    for i in range(3):
        print("--" + str(timeit.timeit(test2, number=1)))
    print("-Test #3 (read speed)")
    for i in range(3):
        print("--" + str(timeit.timeit(test3, number=1)))

In my case:

C:\ is an NVMe drive

D:\ is actually 4 NVMe drives in RAID0

E:\ is a 7,200 RPM SATA HDD

My results are (in seconds):

On the c:\ drive...
-Test #1 (proc speed)
--0.883544537654
--0.878839311374
--0.902597402597
-Test #2 (write speed)
--0.448425204932
--0.427565927223
--0.427990160524
-Test #3 (read speed)
--0.262958274602
--0.261044435017
--0.267392538968
On the d:\ drive...
-Test #1 (proc speed)
--0.883161701312
--0.968546179848
--0.957553405499
-Test #2 (write speed)
--0.41787253842
--0.361432745337
--0.361836109096
-Test #3 (read speed)
--0.249162823478
--0.251594980362
--0.257517482517
On the e:\ drive...
-Test #1 (proc speed)
--0.863139258002
--0.902649405389
--0.906740519754
-Test #2 (write speed)
--2.14880975189
--2.11361173074
--2.09666840009
-Test #3 (read speed)
--0.247759090225
--0.251948051948
--0.251255935845

My equipment is an HP Z840 workstation with dual-socket Xeon 2687v4 processors and lots of RAM. The C:\ drive is one of these: http://www8.hp.com/us/en/workstations/z-turbo-drive.html and the D:\ drive is one of these http://www8.hp.com/us/en/workstations/z-turbo-drive-g3.html in RAID0. At any rate, I am not super impressed, and most likely there is something off with the system config...
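If it would help isolate the config question, a raw sequential write/read pass like this would take ArcGIS and numpy out of the picture entirely (just a sketch; the sizes and \temp paths are placeholders, and the read can still come out of the Windows file cache if the file fits in RAM):

import os
from timeit import default_timer as timer

def sequentialTest(path, sizeGb=4, blockMb=64):
    #write then read one large file sequentially and report MB/s
    block = b"\0" * (blockMb * 1024 * 1024)
    blockCount = int(sizeGb * 1024 / blockMb)
    time1 = timer()
    f = open(path, "wb")
    for i in range(blockCount):
        f.write(block)
    f.flush()
    os.fsync(f.fileno())  #force the data out to the drive before stopping the clock
    f.close()
    time2 = timer()
    f = open(path, "rb")
    while f.read(blockMb * 1024 * 1024):
        pass
    f.close()
    time3 = timer()
    os.remove(path)
    sizeMb = blockCount * blockMb
    print(path + ": write " + str(int(sizeMb / (time2 - time1))) + " MB/s, read " + str(int(sizeMb / (time3 - time2))) + " MB/s")

for driveLetter in ["c", "d", "e"]:
    sequentialTest(driveLetter + ":/temp/seq_test.bin")  #assumes a \temp folder exists on each drive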

Not sure why my FTP link doesn't work for you... I reposted the test data as a zip file (~25 MB) to make it easier. Try just pasting ftp://ww4.dnr.wa.gov/frc/for_dan/ into Windows Explorer.

Thanks,

Chris

DanPatterson_Retired
MVP Emeritus

Ok ... sorry, I should have mentioned:

Python 3.5.3, scipy 0.18.1, etc., 8 GB RAM

                    C      D      E      me
test 1  creation    0.9    0.9    0.9    0.7     seconds .... similar
test 2  write       0.4    0.4    2.1    1.3     interesting!
test 3  read        0.25   0.25   0.25   0.25    interesting again

I will move to the 'machine' tomorrow morning and try varying the file size as well, since it has the same mix of drive types and similar specs to yours.

I will also try the ftp through Explorer.

Thanks for testing

ChrisSnyder
Regular Contributor III

Yeah, I am starting to think that the benefits of NVMe disks might only shine through when doing really big multithreaded processing. Possibly these single-threaded tests don't have enough data throughput to reveal disk read speed as the bottleneck?
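If I get a chance, I may try something like this to test that idea: several threads reading different slices of the same big file at once versus one reader, so the drive actually sees a deeper I/O queue (a rough sketch; the file path is a placeholder, it really needs a file much larger than RAM, and the first pass warms the Windows file cache, so the numbers would only be indicative):

import os
import threading
from timeit import default_timer as timer

testFile = r"d:\temp\seq_test.bin"  #placeholder: any multi-GB file on the drive being tested
chunkBytes = 1024 * 1024            #1 MB reads

def readSlice(path, start, length):
    #read one slice of the file in 1 MB chunks
    f = open(path, "rb")
    f.seek(start)
    remaining = length
    while remaining > 0:
        data = f.read(min(chunkBytes, remaining))
        if not data:
            break
        remaining -= len(data)
    f.close()

def timedRead(path, workerCount):
    #split the file into workerCount slices and read them concurrently
    sliceBytes = os.path.getsize(path) // workerCount
    threads = [threading.Thread(target=readSlice, args=(path, i * sliceBytes, sliceBytes))
               for i in range(workerCount)]
    time1 = timer()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return timer() - time1

for workerCount in [1, 2, 4, 8, 16]:
    print(str(workerCount) + " reader(s): " + str(round(timedRead(testFile, workerCount), 2)) + " seconds")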

ChrisSnyder
Regular Contributor III

Dan - Did you ever get any results from my script/data?

DanPatterson_Retired
MVP Emeritus

Chris... sorry, I got waylaid... but I did quite a bit of reading on the subject, which kept leading me off in all kinds of other directions.  I have yet to find a suitable comparison, since most of the readings were focused on program launch times and data access, and on whether the data are contiguous in memory or not.  I haven't forgotten... sorry.  One advantage... pure silence... no whirring and clicking of the drives
