Saturday, July 21, 2018

Wonderland Post 1

Last month I visited the Yellowstone area with some family. On this trip I discovered The Story of Man in Yellowstone, published in 1949 by Merrill Beal. It gives a great (if aged) review of the recorded history of human (or at least non-Native American) exploration of the Yellowstone area. He calls the area Wonderland, which I think is great and I'm totally using from now on.

While reading this book, I had to constantly refer to a map because the landmarks Beal references were all mountains, rivers and lakes—there was no Norris Junction or Old Faithful Village in the early 1800s. I realized that my (and I imagine most everyone else's) mental map of Yellowstone is heavily influenced by the standard NPS park map.

This is no knock on the NPS map—it is a beautiful map that is fantastic at doing its job (I'd love to chat with Tom Patterson someday about the NPS maps and terrain mapping). However, that job is to show you how to get to a very select group of relatively easily accessible attractions. Most mountains, mountain ranges, rivers, and lakes are buried deep in the visual hierarchy.

So, after all that intro, my goal with this map is to show the Wonderland that Colter (probably) saw, the Wonderland that the Folsom and Washburn expeditions later explored. I want to flip the NPS map's hierarchy. Natural features will be the focus, while man-made roads and arbitrarily-delineated attractions will fade into the background as reference points.

While we were visiting, we drove to the Grand Canyon of the Yellowstone. A sign between Norris and Canyon Village said we were crossing the caldera rim. While I could pull the NPS map out and see where the rim was, the terrain was so muted I had no way to relate that to what I was seeing out the window, or mentally zoom out and situate my experience into the physical area as a whole. This map should fix that.

Elevation

The first and most important part of the map is the underlying terrain. If we're going to talk about mountains and ridges, they'd better pop. Also, thanks to the caldera and other tectonic goodies that make the area what it is, there are some interesting stories to be told here. Existing maps just don't do these justice.

Elevation maps (hypsometric tints) are tough to do. You often see a kinda-natural-but-definitely-not color ramp, like on those giant roll-up physical relief wall maps at school (do schools still have those? I sure hope so, even if only so the current generation knows the frustration of trying to get one to roll back up). Or, you see some that try to be as realistic as possible. Unfortunately, land cover is only partially determined by elevation, so this often strays into the uncanny valley.

For Wonderland, I chose to approach the uncanny valley head on, and I'm really pleased with the colors I finally came up with. I think they do a good job showing the elevation changes while sticking to a mountain forest color palette. You can really see the uplands in the southwest part of the park, and how the caldera dominates the center while the northern and eastern edges are more "traditional" mountainous areas.


I'm still playing around with the bright greens, but everything else is really working for me. This was the result of lots of little tweaks to the North America County San Bernardino 3 ramp from ESRI's Terrain Tools download.

Hillshade

Hillshading is another vital part of terrain maps, the thing that tweaks your brain into seeing mountains and valleys. I usually prefer to use the GDAL hillshade command with the combined option, which takes out a lot of the grays from the default algorithm. Because I'm focusing on the terrain, I chose a z scale of 5 to try to make it pop a little more.
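
For reference, the command I'm describing looks something like this (the file names are placeholders):
gdaldem hillshade -combined -z 5 dem.tif hillshade.tif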

The default 1/3 arc-second DEMs from the USGS can have lots of artifacts that aren't natural (look at the ripple effect along the gullies in the unsmoothed image below). To fix this, I smoothed the DEM using a DEM super-tool I've created and will probably write more about in the future. My first try is a Gaussian blur with a radius of 30 pixels (which is a different blur than the standard focal mean tool in ArcGIS, if I understand that tool correctly).

Comparing it to the unsmoothed image, I may try tightening the radius to preserve some of the edges. Or I may try some other cool smoothing techniques. We'll see.

Unsmoothed


Gaussian Blur, 30 Pixels


Finally for today, one oddity.

One of my biggest requests/hopes/desires/dreams for ArcGIS Pro is for different blend modes for each layer, instead of the standard transparency slider. This is especially useful for hillshade and elevation mashups like this, where a multiply mode would make my heart sing for joy (QGIS has it!).

I've heard rumblings that it may be in the long-term plan, but until then I seem to have a bug in my setup that does it for me already:

This image has the exact same settings as the ones above; for some reason it just occasionally draws weird. I have to toggle different layers on and off before it reverts to what I'd expect.

Friday, January 5, 2018

Zion's Fireworks: A Map of Canyons in Zion National Park

I recently created this map of Zion Canyon in Zion National Park for a good friend who loves canyoneering, and I really like the result; it ticks all the right boxes for me. The hillshade exposes the drama and excitement of the terrain, and I think I got the elevation color ramp pretty close to awesome.

As a canyoneer you read about all these great canyons, but sometimes it's hard to get a good mental map of where they are in relation to more common points of interest. The USGS topos usually label the canyons and routes that take their names from old, established canyons, like Heaps, Imlay, and Behunin Canyons. However, names like Das Boot and Misery Canyon were created by the canyoneering community and are nowhere to be found. Fortunately, the Geographic Names Information System (GNIS) from the USGS includes most of these canyons, allowing us to create our own map:

(Keep reading after the map for a little bit more on the creation process)

Click here for a higher-resolution image

The Process

I really wanted to show off the majesty of the landscape, to show how the canyons cut sharp clefts into it. I found 1-meter elevation data from the USGS for the Park, but it had some annoying artifacts that appear to come from the lidar data or processing:


However, it smoothed up quite nicely at this scale using the MDenoise program created by Sun et al. (2007), while still retaining a lot of the sharp details of the sheer red-rock cliffs:


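If you want to try MDenoise yourself, the invocation looks roughly like this—I believe it wants an ASCII grid rather than a GeoTiff, and treat the -t (threshold) and -n (iterations) values here as starting points rather than gospel:
mdenoise -i dem.asc -t 0.99 -n 5 -o dem_smoothed.asc
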
I added a slope layer as well, symbolized from flat = white to steep = black and set to about 80% transparent. This doesn't show up much in the canyons, but it brought out a little more detail in the lower cliff faces of Zion Canyon.
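
I built the slope layer in my desktop GIS, but if you wanted to reproduce it with GDAL, a minimal sketch (hypothetical file names) would be a slope raster plus a two-line color ramp mapping 0 degrees to white and 90 degrees to black:
gdaldem slope dem.tif slope.tif
gdaldem color-relief slope.tif slope_ramp.txt slope_shade.tif
where slope_ramp.txt contains:
0 255 255 255
90 0 0 0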

Then, I created a custom color ramp based on the colors from the NPS' geology guide for the park to echo the actual rock layers and formations. It does a decent job, but can't quite do it justice without some representation for the vegetation on the plateau tops. It's also not properly indexed; the slight tilt in the strata relative to a flat sea-level "zero elevation point" means that, on opposite ends of the park, you could see different formations at the same elevation. In this map, every point of a given elevation is the same color.

The roads and rivers layers should hopefully help orient you to places you remember if you've visited the Park: the switchbacks up to the tunnel, and the end of the road at the Temple of Sinawava. I've let the flow of the labels, along with the hillshade, reveal the canyons themselves by setting the canyon lines to "No Color." Finally, a couple points of interest, like The Sentinel and Angels Landing, are identified to give a little bit more context and orientation.

And a Little Bit of Map Philosophy

All maps are, to an extent, an abstracted representation of reality. Some try to eliminate this as much as possible, using satellite imagery and high-accuracy polygons for every feature imaginable. Others embrace their abstractness, using labels in place of lines and funky symbols that represent an entire town.

With its color ramp and hillshade, I want this map to straddle the border between representational and realistic. The colors don't match exactly with what formations are found in the exposed strata at any given exact place in the canyon. But, with the whites and greens capping the domes and plateaus and the changing shades of tans and brownish-reds descending to the canyon floor, your mind (hopefully) is reminded of the grand wonders that exist in the real world.

I hope we never lose the ability to be amazed by these wonders.

Monday, January 1, 2018

A Functional Introduction to GDAL, Part 2: GeoTiff Compression

So you just got a hard drive containing some gorgeous new 6" resolution aerial imagery for your hometown and you can't wait to play with it. Upon opening the imagery folder you're hit with a grim reality: high-res imagery file sizes are huge.

Compression techniques allow us to eliminate redundant data in a raster file in such a way that we can recreate the original file (lossless compression) or enough of the original to look good (lossy compression).

Because of its ubiquitous support in other software and its multitude of compression options, this post will focus on some of the compression options available in the GeoTiff driver. The different compression schemes can be specified with the creation option -co compress=<type>. The GeoTiff driver page has a list of all the available compression schemes.

Lossless and Lossy Compression

Compression methods fall into two general categories: "lossy" and "lossless."

"Lossy" compression schemes throw away chunks of data that they think aren't too important for the sake of smaller file sizes. jpeg compression relies on the way the human brain processes visual imagery to throw away varying amounts of "extraneous" data that (depending on the amount of compression specified) we don't usually notice. This results in files that are drastically smaller than if we had full three 8-bit values for each raster cell.

The reason it's called lossy compression is that you will lose data. Once you use a lossy compression scheme on a raster, you cannot recover the original data. You can get something that looks very visually similar, but the underlying values are irrevocably changed.

In contrast, "lossless" compression schemes use patterns and other mathematical magic to compress the data in a way that it can be perfectly reproduced when it's decompressed. The compression ratios for lossless compression schemes usually aren't quite as good as lossy compression schemes, but you keep all of your original data.

As a rule, never use lossy compression on data where the cell value is important, like elevation data or reflectance values in multispectral imaging. It does work great for aerial imagery that you're just trying to use as a basemap, however.

Lossy Compression with JPEG

My preferred aerial imagery compression scheme is jpeg with YCbCr color representation (-co compress=jpeg -co photometric=ycbcr). If you're big into color theory you can probably explain better than I can what YCbCr is and why exactly it helps, but long story short, it can sometimes give you better jpeg compression ratios (read: smaller files) for aerial imagery.

You can also play around with the jpeg quality creation option (higher quality = less data loss but larger file size); I've found the default of 75 to work pretty well for the projects I've done, without major (or even noticeable) visual degradation. This gives me files that are roughly 5% to 10% the size of what they'd be without any compression. I've seen smaller ECW files from vendors, but this is close enough for what I need.
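
Putting those options together, a typical command looks something like this (the file names are placeholders):
gdal_translate -co compress=jpeg -co photometric=ycbcr -co jpeg_quality=75 imagery.tif imagery_jpeg.tif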

One gotcha I've come across with jpeg compression is artifacts, both inside the image data area and in the NoData areas along some borders. I'll write more on this later, with visual examples, but here are two guidelines to reduce/eliminate jpeg compression artifacts:
  1. To eliminate artifacts on the outside edges, make sure the original image data area is perfectly rectangular. If merged tiles shift down a couple pixels every tile, creating a stair-step border of white NoData areas, the final image will show some artifacts in those white NoData areas.
  2. To eliminate artifacts inside the data areas on the right and bottom of your image, make sure the tile/stripe size is a multiple of 8 or 16 (see the note about RGB jpeg compression on the GeoTiff driver page linked above): -co blockxsize=256 -co blockysize=256. If that doesn't work, make sure your total image height and width are multiples of your block size (see this mailing list thread). 

Lossless Compression with Deflate and LZW

When I'm working with large elevation datasets and want compression that preserves the original data, I'll usually use deflate compression (-co compress=deflate). This generally seems to give me higher compression ratios than LZW compression (the other main lossless compression scheme)—I've seen it reduce file size by around 50%, but your mileage may vary based on specific datasets.

You can use the -co predictor option to improve compression ratios and the -co num_threads option to speed up compression when working with LZW or deflate. If you're just working with aerial imagery, -co predictor=3 doesn't make sense because your data isn't in floating-point format (i.e., it's all whole numbers); predictor=2 (horizontal differencing) is the one for integer data. These can be useful for very large datasets, but the smallish example dataset below only took a few seconds to compress no matter the method.
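
As a sketch, compressing an integer DEM with both options might look like this (the file names are placeholders):
gdal_translate -co compress=deflate -co predictor=2 -co num_threads=all_cpus dem.tif dem_deflate.tif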

Compression Comparison

I used the San Francisco Bay image available from Earthstar Geographics to do a basic comparison of the compression schemes we've discussed. The image is 7204 x 7204 pixels in three bands of type byte, originally in the GeoTiff format. Note that the compression percentages will vary between every image (the relatively poor compression for LZW surprised me here) but the general ranking should stay the same.


Compression Scheme           File Size   % of Original File Size   Lossy/Lossless
Uncompressed                 155.9 MB    n/a                       n/a
Deflate (z=6)                112.7 MB    72.3%                     Lossless
LZW                          142.9 MB    91.7%                     Lossless
jpeg (75% Quality)           25.1 MB     16.1%                     Lossy
jpeg (YCbCr, 75% Quality)    9.1 MB      4.8%                      Lossy
jpeg (YCbCr, 90% Quality)    14.9 MB     9.49%                     Lossy
ECW (2:1, from vendor)       28.3 MB     18.2%                     Lossy

Note: The file sizes here were calculated on a Mac, which defines a MB as 1000^2 bytes (decimal-based), while Windows currently defines a MB as 1024^2 bytes (binary-based). If you download the example file yourself, you'll see the metadata file says it's 148 MB, which would be the Windows version of the size. Confusing? Yeah.

The jpeg compression is the clear winner here, and zoomed in to actual size (one pixel in the image is one pixel on my monitor) I couldn't tell any difference even between the 75% quality jpeg and the uncompressed GeoTiff.

Compression and BigTIFF

The default TIFF format only allows for files 4 GB or smaller. GDAL overcomes this barrier with the BigTIFF creation option, which supports really big files. The default value is if_needed, which tries to figure out if the new file will be greater than 4 GB and sets the flag if needed. Because this is the default, you have probably already created uncompressed GeoTiffs larger than 4 GB without even realizing this option exists.

However, compression throws a wrench into this default. The driver can't predict how well the compression will work, so it may fail to set the BigTIFF flag, which often leads to write errors when creating large datasets. Whenever I'm working with large compressed datasets, I use -co bigtiff=yes if there's any possibility the output will be greater than 4 GB.
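
So a command for a big compressed mosaic might look like this (the file names are placeholders):
gdal_translate -co compress=jpeg -co photometric=ycbcr -co bigtiff=yes mosaic.tif mosaic_jpeg.tif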

ECW and MrSid

I've mentioned "ECW" and "MrSID" a couple times in this series. These are different formats (not compression options for GeoTiffs) that use mathematical magic known as wavelets to achieve lossy compression ratios greater than what's possible with jpeg, and with better image quality and data access to boot. Several JPEG2000 compression schemes also use wavelet-based compression in a similar manner.

The potential downside is that these are not open formats—they are protected by patents and require special (and possibly expensive) licenses. There's usually a free license for reading the data (which is what allows GDAL to read them), but creating them (or, in the case of ECW, distributing images from a server as part of a webmap) requires ponying up the license fee. You're the only one who can decide if that route is best for you.

Next up

Our next post will investigate gdal_translate, one of the most common and most useful GDAL tools.

Notes
Paul Ramsey's excellently concise compression discussion: http://blog.cleverelephant.ca/2015/02/geotiff-compression-for-dummies.html
Kersten Clauss' discussion (and test script!) on GDAL's lossless compression options: http://www.digital-geography.com/geotiff-compression-comparison/
A deep discussion on jpeg compression by Even Rouault, one of the GDAL developers: http://erouault.blogspot.com/2014/04/advanced-jpeg-in-tiff-uses-in-gdal.html (Seriously, this guy is awesome. Super knowledgeable and very active on the gdal-dev mailing list).

A Functional Introduction to GDAL, Part 1: Creation Options

Many GDAL tools create a new file that holds the results of their work. Most of these tools support the -of, -ot, and -co switches to set the output format, the output data type, and any format-specific options.

Setting the Raster Format: -of <name>

There are dozens upon dozens of different raster formats in the GIS world, and GDAL can handle many of them. The ability to read or write a format is provided by what GDAL calls "drivers," and there's a specific driver for each format.

While GDAL can read just about anything you throw at it, it can only write a handful of formats. The list of supported formats can be found in the GDAL documentation. The "Create" column will tell you if write support is present for your desired format. "Copy" is slightly different from create and depends on the raster data already being present. If you're opening and modifying an existing raster, you can use a format that has Copy capabilities; if you're creating a new raster, you need to use a Create format. See the CreateCopy() documentation for more info.

Many formats require additional work—for example, the ECW driver requires the ECW SDK from Hexagon Geospatial, which has two different types of licensing. The read-only license is freely available for desktop use (but does not allow you to use it with server software distributing imagery), while server read-only and desktop read-write require a paid license. Getting this working is beyond the scope of this post, though I think I've seen the stand-alone QGIS installation come with read support enabled.

The name in the "Code" column is the "short format name" mentioned in the help documentation and is the name you'll use with the -of switch. For example, if I want a GeoTiff, I'd type -of GTiff.

Note: As of GDAL 2.3, GDAL tries to guess the output format based on the file extension of the output filename you provide—to get a GeoTiff, you'd just have to have your filename be .tif. In previous versions, it defaults to GeoTiff if no format is specified with -of.
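
In other words, both of these produce a GeoTiff on GDAL 2.3+ (the file names are placeholders):
gdal_translate in.img out.tif               # 2.3+ guesses GTiff from the .tif extension
gdal_translate -of GTiff in.img out.tif     # explicit; works on older versions too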

One format that you'll notice doesn't have creation support is jpeg, which seems odd at first blush. However, if you want highly-compressed rasters, the jpeg compression method for GeoTiffs (see the next post on compression options) provides comparable results, just with a .tif file.

Specifying an Output Data Type: -ot <type>

Different datasets will have different basic data types, which define how the data for each raster cell are stored. A DEM with elevations from 1272.3 to 3311.9 may be in Float32, while aerial imagery may have three Byte bands—red, green, and blue, with values from 0 to 255.

Each data type has specific minimum and maximum values it can represent. In addition, Byte and the various Ints (or integers) can only represent whole numbers, while the Float types allow you to represent decimal numbers, like an elevation of 1564.357 (remember, this value by itself is unitless—you need to check your metadata to see whether it's an elevation in feet, or just a brightness value from 0 to 255).

The following table shows the minimum and maximum values for each type (assuming they match the equivalent C++ types). This is taken from the GDAL API reference and the Microsoft C++ reference (the various int min/max values should stay consistent with other compilers and platforms, but there may be slight variations in the Float32 and Float64 values).


Type      Minimum           Maximum          Precision
Byte      0                 255              Whole Numbers
UInt16    0                 65,535           Whole Numbers
Int16     -32,768           32,767           Whole Numbers
UInt32    0                 4,294,967,295    Whole Numbers
Int32     -2,147,483,648    2,147,483,647    Whole Numbers
Float32   3.4x10^-38        3.4x10^38        6 Decimal Places
Float64   1.7x10^-308       1.7x10^308       15 Decimal Places

The -ot switch allows you to specify the type of the output file. There are a couple nuances here, however, that we have to keep in mind:
  • First, specifying a data type does not automatically scale the data to fit into that data type. Rather, it just tries to copy the value straight across, clipping out data that doesn't fit. Going from Byte to Float32 is just fine: 232 becomes 232.0. However, going the other way presents problems: 67.395 becomes just 67, and 5692.845 gets clipped down to 255. Use gdal_translate with the -scale switch to appropriately scale your data into the new data type's range (see the example after this list).
  • Different types require different amounts of storage for each value. A Byte, as we'd expect, takes up one byte per cell, while a Float32 takes up four bytes (32 bits, 8 bits to a byte). This means a Float32 raster will be about four times as large as a Byte raster of the same dimensions, limiting your storage options and increasing your file read times. This also means processing the same area would require four times the allocated RAM, which could lead to memory size issues on larger data sets and 32-bit environments.
  • Some programs can't handle certain data types. For example, I wanted to make a custom hillshade represented by brightness values of 0-255 with a NoData value, so I specified -ot uint16 and a NoData value of 256. However, when I tried to load it into GIMP to play around with it as a raster image, it failed because GIMP can't handle uint16 data. (Note: The default gdaldem hillshade command creates a byte-type raster, using 0 as NoData and 1-255 as the brightness values. My insistence on using the range 0-255 for brightness values was foolishness.)
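
Here's the scaling example promised in the first bullet—squeezing a Float32 DEM into a Byte raster by mapping the elevation range to 0-255 (the file names are placeholders):
gdal_translate -ot Byte -scale 1272.3 3311.9 0 255 dem_float.tif dem_byte.tif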

Creation Options: -co <option>

Creation options allow you to set certain parameters that are specific to each raster format. For example, the GeoTiff driver gives you the ability to compress your image (-co compress=<type>) or to create an accompanying .tfw world file (-co tfw=yes). Multiple -co switches can be used in one command, like this set that creates a GeoTiff with jpeg compression, a YCbCr color scheme, and the ability to create a file bigger than 4 GB: -co compress=jpeg -co photometric=ycbcr -co bigtiff=yes.

The creation options available to each driver are listed on the driver's documentation page, which are linked from the format list above. For example, the GeoTiff creation options can be found about halfway through the GeoTiff format documentation.

The documentation usually shows all the creation options in upper case. This isn't absolutely necessary, however. I usually just use lower case for all the options (like you see in these examples) because it's easier to type.

Sidetrack: Configuration Options

There's a bit of a grey area between the definitions of creation options and configuration options (https://trac.osgeo.org/gdal/wiki/ConfigOptions). Generally speaking, creation options change the way data is stored in the file you create, while configuration options affect how GDAL functions as it creates that file. You can specify configuration options as part of a command with the --config switch, like this: --config gdal_cachemax 2048. Note that, unlike creation options, configuration options don't have '=' between the option and its value.

Note: The distinction between creation and configuration options gets even grayer in gdaladdo, where you set the compression options for your overviews via the --config switch.
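
For example, building jpeg-compressed overviews looks something like this (the file name and overview levels are placeholders):
gdaladdo --config COMPRESS_OVERVIEW JPEG --config PHOTOMETRIC_OVERVIEW YCBCR -r average imagery.tif 2 4 8 16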

Putting It All Together
Let's say we want to change an Erdas Imagine (.img) file to a jpeg-compressed GeoTiff:
gdal_translate -of GTiff -co compress=jpeg -co photometric=ycbcr -co tiled=yes -co bigtiff=yes in.img out.tif

Coming Up Next

Next post we'll look at the compression options available with the GeoTiff driver.

Notes
More on ECW: http://www.gdal.org/frmt_ecw.html, https://trac.osgeo.org/gdal/wiki/ECW, and https://gis.stackexchange.com/questions/154929/how-to-install-gdal-with-ecw-support-on-windows?rq=1

Tuesday, December 26, 2017

A Functional Introduction to GDAL, Part 0: Installing GDAL

Welcome to GDAL!

This is the first of what should be several posts on GDAL, the bread-and-butter library and toolset for playing with raster data. There are several introductions to GDAL on the web already that eloquently explain what GDAL is and why it is useful—for example, Robert Simmon's 4-part series on Medium.

This series is geared towards working GIS professionals: people who understand projections and have used the raster tools in ArcMap or QGIS, but may be a little hesitant to dip into the command line. It will be organized around the tools available and how they're used for common tasks—an elaboration on the official docs with an emphasis on practical examples.

Installing GDAL On Windows

Installing GDAL used to be either really simple or really annoying, depending on your computing environment. GDAL is written and actively maintained by developers: people who are comfortable compiling from source and managing environment variables and installations. As such, it works great in their world but requires a little massaging for everyday GISers, especially those of us coming from a Windows-centric ArcMap world.

Then came Conda.

Conda is amazing. Conda is a package and environment manager for Python and Python-related packages. Conda makes installing and running GDAL easy.

So, how do you get it?

First, download and install Anaconda. The Anaconda Platform is the support structure for the Conda command line tools and includes helpful utilities and installers.

Next, we'll use the Anaconda Navigator to create a new environment. To do this, go to the Environments pane and select "Create" at the bottom. Give it a name—I call mine "gdal" for clarity's sake—and select the latest 3.x version of Python (unless you've got a good reason to use 2.7). Click "Create" to finish it off.

Note: It doesn't matter what python is currently installed on your computer (like if you've got python 2.7 from ArcGIS Desktop) because Conda creates its own sandbox for each environment. These sandboxes are completely separate from each other and from any other Python installation on your computer—that's the beauty of Conda!

Next, we're going to add the conda-forge repository ("repo"), as it usually has a more up-to-date version of GDAL. To do this, click on the "Channels" button and then select "Add." Type "conda-forge" into the text box and hit Enter.

Now, you can search for and add the gdal packages to your new environment. With your environment selected, change the dropdown menu at the top of the packages pane to "All" and type "gdal" (w/o quotes) into the search box. Select the checkbox next to the entry for gdal when it appears and click "Apply" in the bottom right corner. It will work out what other packages are needed and install everything.
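
If you'd rather skip the GUI, I believe the equivalent setup from the Anaconda Prompt looks roughly like this:
conda config --add channels conda-forge
conda create -n gdal python=3
activate gdal
conda install gdal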

Another Note: Getting the latest version of GDAL gets you the latest updates and bug fixes. For example, GDAL 2.1.3 has a bug that does not allow you to set the altitude parameter when calling DEMProcessing() via the Python bindings (see this bug report). This bug was fixed in 2.2.0.

More Notes! As of this writing (end of 2017), I've noticed that installing the SciPy libraries with the Anaconda Navigator downgrades GDAL to 2.1.3. After you install SciPy, you should see a little green arrow in the checkbox for GDAL in the installed packages view, indicating an update is available. Go ahead and update GDAL back to the newer version by clicking this green arrow and then selecting "Apply" at the bottom.

Switching Environments
To play in your new environment, navigate to the Anaconda folder group in the Start Menu and select Anaconda Prompt. This opens a new command prompt that is aware of our Conda environments.

Like I mentioned, Conda gives us a separate sandbox for each environment we create. We need to switch to our new environment by typing activate gdal (replacing gdal with whatever environment name you chose earlier). After a pause, the command prompt will change to something like (gdal) C:\Users\jadams. This means that you're now in the gdal sandbox.

To verify GDAL is installed and working properly, type gdalinfo --version and hit Enter. You should get output that looks like GDAL 2.2.2, released 2017/09/15.

If you create other environments, you can switch between them by activating the other environment with activate foo, just like you'd do to activate your first environment.

Conda on Other OSes

macOS
Installation on macOS is similar to Windows, and Anaconda Navigator works the same in both OSes. The only difference is that we'll use the standard Terminal app instead of the Anaconda Prompt. In addition, we have to type source activate gdal instead of just activate gdal. Beyond that, enjoy the flexibility of bash (or csh, or whatever shell you're using).

Linux
Conda on Linux requires more CLI-based magic than Windows or macOS. I've not tried it myself, but a quick search for "install conda ubuntu" gives us these instructions, and it should work just the same for other distros as well.

Linux also shares the same source activate requirement as macOS to switch between environments.

Coming Up Next

The next post will focus on some common file creation considerations that you'll use whenever your GDAL commands result in a new file.

Resources

GDAL Homepage: http://www.gdal.org
Conda User Guide: https://conda.io/docs/user-guide/index.html
Anaconda User Guide: https://docs.anaconda.com/anaconda/user-guide/
GISInternals pre-built GDAL binaries for Windows: http://www.gisinternals.com/release.php
Alternate Windows Installation: https://sandbox.idre.ucla.edu/sandbox/tutorials/installing-gdal-for-windows

Thursday, August 10, 2017

Using SDE Feature Classes Inside a Python Geoprocessing Service

I finally had a breakthrough today on a problem I've been working on off-and-on at work for a while now, and I want to share it with you (and write it down so I remember it...).

Like a lot of places, we use a backend SDE to hold the data that we work on internally and expose publicly through map services. I've been trying to create a Geoprocessing Service based on a Python script tool to do some things we need done. Within this script, I directly access the parcels feature class in our SDE, which has tens of thousands of features.

The script tool runs just fine from within ArcMap, with the path to the feature class either hardcoded in or provided as a tool parameter:

parcels_fc = "Database Connections\\connection.sde\\Cadastral\\parcels"
or
parcels_fc = arcpy.GetParameterAsText(4)

The Problem
However, sharing the results to our ArcGIS Server as a GP service never worked out. If I hardcoded the path, it would upload but the service would always fail when I'd try to run it in a web app. If I got it through a parameter and selected "Constant Value" for that parameter when sharing the service, the server would say that it couldn't find the data in its data store. 

I tried all sorts of different permutations to try to get this to work, to get the server to recognize that it had the SDE connections registered in its data store, but to no effect. A "Failed to consolidate data" error was common when trying to stage the service prior to uploading to the server.

Solution
Ok, enough background, what fixed it? The key was to put a copy of the SDE connection file in a place where it would get copied into the server's data store and to use a relative path to access it from within the script. 

First, I created a folder in the same directory as the script (which we'll call "path" for this example) and copied the SDE connection file into it. Next, I used the sample code from the Esri documentation (scroll down to "Relative Paths to Folders") to create something like this:
import os
import sys

script_path = sys.path[0]  # the directory containing this script
folder_path = os.path.join(script_path, "path")  # the folder holding the .sde connection file
parcels_fc = os.path.join(folder_path, "Cadastral\\parcels")

In the script tool properties, make sure "Store absolute paths as relative paths" is checked.

And voila. When I share the results as a GP service, the server picks up the "path" folder and copies it to its data store along with the connection file inside of it. The script knows to look for the connection file in the folder "path" that resides in the same directory as the script, then uses that connection to access the SDE and do whatever magic it is that you've told it to do. 

Resources:
https://gis.stackexchange.com/questions/10503/creating-a-gp-script-to-update-a-table-in-sde
https://gis.stackexchange.com/search?q=geoprocessing+sde

Saturday, June 24, 2017

Mental Maps and The Shadows of Human Impact

I like maps. I like to stare at them, try to figure out what their story is. I especially like maps of the physical environment—I've spent hours on Google Maps' terrain view exploring different areas, trying to read the earth, what the bare ground is telling us.

But even with Google Maps, the roads are still there, still distracting us, still saying "look what humans have created! Look at this mighty road we built in this impossible canyon!" So I set out to make my own terrain maps, free of the influence of humans.

My first attempt actually ended up telling more of humankind's story than I intended.

Looking for good-but-not-overwhelmingly-detailed elevation data, I downloaded the 5-meter auto-correlated DEM data for the Salt Lake valley from the Utah AGRC. Auto-correlated elevation models are created by examining the differences in stereo image pairs (see https://gis.stackexchange.com/a/61993) and thus show various parts of the built environment. What this means is that my attempt to scrub away humankind was beautifully flawed and ended up showing some really cool artifacts.

Let's start with freeways, because every post about the evils of human influence has to talk about freeways, right? This is an overpass. Even without showing the roads, the massive amount of earthmoving that's been done to raise one high-volume stream of metal, cloth, and human beings over another stream is a little stupefying.

Even old-school grid systems, the darling child of walkability nerds and new urbanists, get in on the act. They're easier to see when they're cut into a hillside and the streets square off the lots to accommodate our discomfort of weird, unordered shapes.

This is a wonderfully disjointed bit of geomorphology. Freeway planners of the 50s and 60s took advantage of a natural stream channel/canyon to build a freeway without having to tear down homes. The steep canyon walls gradually give way to the valley floor, however, and are replaced by the flat bed and smooth, flowing, engineered curves of the Interstate Freeway system.

For all our love of attacking modern humans' impact on the environment, let's not forget that something as Utah-Pioneer-'Merica!-ish as farming shows up as well. This is one of the last center-pivot irrigation fields I could find in the valley, though the latest satellite imagery shows it's been dug up in preparation for some sort of development.

File this one in the "What the ...?" department. A look at the satellite imagery shows this massive, angular cut into the hillside was made for... a park. A park?! I'm sure there's a story and reason behind it, but I can't for the life of me figure it out.


————————————————————


But the point of this post isn't to tell this micro story, the story of how we've shaped the earth. No, the story I want to tell is this one.



Look familiar? This is the Salt Lake valley. I've always wondered what it looked like before 1847, when the only human inhabitants were the Northwestern Shoshone. Like I said above, it's not perfect, but it still gives us a much better picture of what the land beneath our tires looks like.

The human mind is amazing at creating mental maps of areas. We know the routes that are familiar, the topological links between our house, the roads, the freeway, and our office or favorite shopping center. We know the canyons and tall mountains are "east," that Ogden is "north" and Provo is "south." We recognize the Bingham Canyon Mine slag piles as "west."

But this plays tricks on us as well. For years, I always thought of going downtown as "going up" somehow. Looking at the elevation color (AKA hypsometric tint), downtown is lower than most of the valley! My mental map doesn't match reality, and it doesn't even match the word "downtown." But the "downtown is up" concept was so engrained in my brain that in Portland I always thought I was going north when I got off the train at Pioneer Square and walked uphill to the PSU campus... on the south end of downtown. But I was going up!

What I like about this map is that except for the few highly-identifiable freeway cuts and raises, you can't use any of your normal mental map clues to figure out where you are. You have to stop thinking about a home, the road, or an arbitrary address grid coordinate, and start thinking about river channels, glacial moraines, and canyons.

After looking at this map in detail, someone told me they couldn't figure out where was where. That's the point.

As Yoda once said, "you must unlearn what you have learned." Let go of your east side vs west side mentality, your downtown vs suburbs bias, your industrial wastelands vs green-grass dreamworld. Release your a priori knowledge, and see what you can find.

Read the story of the valley.


————————————————————

I didn't start intending to write about such abstract ideas as mental maps and such conscientious topics as humankind's impact on the world. No, I wanted to share the discoveries I've made in this map of the place I've lived for over twenty years of my life.

Looking at this map, I realize I barely know it.

I meant to write about the cartographic tools and techniques I used, the thrill of getting something to work after trying half a dozen different ways. Someday I'll write that, and the many other techniques I've discovered in the two years since I first wrote this on my personal blog. But for now, I'll keep exploring.