Notes on Learning Geospatial Analysis with Python, 2nd Ed.

By Joel Lawhead, Packt Publishing, Dec. 2015; ISBN 9781783552429

Chapter 1: Learning Geospatial Analysis with Python

GIS concepts

Thematic maps - portray a specific theme
- Minimal geographic features to avoid distracting from the theme
- Mostly include political boundaries but avoid navigational marks
- Use landmarks to orient the reader
- Common uses:
  - Visualizaing health issues
  - Election results
  - Environmental phenomena
- Tell a story, serve a purpose
- Still only models / narratives about reality
Spatial databases
- Organized collection of information
- May be a DBMS, or not
- Most DBMS's typically contain scalar and blob data
- Spatial / geodatabases extend RDBMSs to store / query data in two or three dimensional space
- Some account for time series data as well
Spatial indexing
- Organizing geospatial data for faster retrieval
Metadata
- Provides traceability for source and history of datasets
- Summarizes technical details of the dataset
- Several possible standards:
  - ISO 19115-1, which gives hundreds of fields to describe a geospatial dataset
  - ISO 19115-2 gives extensions for imagery and gridded data
- Example fields:
  - spatial representation
  - temporal extent
  - lineage
- Open Geospatial Consortium (OGC) created the Catalog Service for the Web (CSW) to manage metadata
- pycsw library implements the CSW standard
Map projections
- Basic problem is always 3D projected onto 2D
- Choice is always a compromise about what to preserve and what to give up
- Projections are typically a set of 40+ parameters
- Format is either XML or Well-Known-Text (WKT), that defines the transform
- Intl. Association of Oil and Gas Producers (IOGP) maintains a registry of most common projections
- That was formerly European Petroleum Survey Group (EPSG)
- Entries are still called EPSG codes
- 5000 plus entries in the registry
- Previously projections were a huge hassle in terms of data storage and maintenance
- Many data formats for geospatial don't actually include the projection info, so it has to be in metadata, which is risky from a management perspective
- Thankfully high speed transfer and cheap storage addresses a lot of that
- Most data formats now have metadata formats defining projection, or it lives in the file header
- There are global basemaps, that give you common projections like Google Mercator/Web Mercator
- Web Mercator is EPSG:3857 (or deprecated EPSG:900913)
- Modern software can often reproject on the fly
- Closely related issue: geodetic datums
- A datum is a model of the earth's surface that matches the location of features on Earth to a coordinate system. Most common is probably WGS 84, used by GPS devices
Rendering
- Geodata vector data exists as 2- or 3-tuples of (x,y[,z])
- Geodata is stored in a coordinate system representing a grid overlaid on the earth (3 dimensional)
- Screen coordinates / pixel coordinates are a 2d grid
- XY world coordinates map to pixel coordinates pretty simply by a scaling algorithm
- The z coordinate has to go through a transform to be mapped onto the 2d plane
- For remote sensing / raster data, challenge is file size and compression
- Lossless compression is possible, though rendering is still computationally expensive

Remote Sensing Concepts

Sources of raster geodata can include ground based radar, laser range finders, specialized devices and detectors for gas, radiation, EM, etc.
For purposes of this text, looking at remote sensing platforms that capture large amounts of Earth data, including Earth imaging systems, some elevation data, and some weather systems
Images as data
- Raster data is captured as square tiles
- If it's multi-spectral, it will contain multiple bands in arrays
- An array item in one of these 2d arrays represents both space and some other value, reflectance or whatever
Remote sensing and color
- Visible light data in RGB can be captured and displayed per pixel
- Non-visible EM can be shown in false color

Common vector GIS concepts

Data structures
- Vector data has at minimum x/y values per point, sometimes a z value
- Coordinates are combined to form points, lines, and polygons
- Typically represents elevation better than raster data
- It's more costly to collect than raster data
- Two important concepts for vector data structures:
  - Bounding box / min bounding box - smallest rectangle bounding all points in a set
  - Convex hull - smallest convex polygon bounding all points in a set
- the bounding box always contains the convex hull
Buffer
- Buffer operations can apply to points, lines, and polygons
- Buffers create a polygon around the original object at a fixed distance
- Buffers are used for proximity analysis
Dissolve
- Creating one polygon out of two or more adjacent ones
Generalize
- Reducing the number of points being used to define an object, for efficiency
Intersect
- Discover whether one part of a feature intersects with one or more additional features
Merge
- Combines two or more non-overlapping shapes into a multishape object
- Multishape objects maintain separate geometries but are treated as a single feature with a single attribute set
Point in Polygon
- Check to see whether a point falls inside or outside a given polygon
- Most commonly done with ray casting
  - Check to see if the point is on the polygon boundary
  - Draw a line from the point in a single direction
  - Count the number of times the line crosses the polygon boundary until it reaches the bounding box of the polygon
  - If the number is odd, the point is inside, if even, the point is outside
Union
- Combine two or more overlapping polygons in a single shape
Join
- SQL join, combines tabular data
- Spatial join defined by a spatial db extension will combine features similar to an SQL join, but does so based on feature proximity. For instance, you could derive county name from features inside a county.
Geospatial rules about polygons
- These are rules of thumb about how geospatial polygons differ from mathematical polygons:
  - Polygons must have at least four points, first and last must be the same
  - A polygon boundary should not overlap itself
  - Polygons in a data layer should not overlap
  - A polygon in a layer inside another polygon is a hole in the underlying polygon
- Different geospatial software resolves those rules differently
- Best practice is to make sure your polygons obey those rules
- By definition a polygon is a closed shape. Some software will error if you haven't explicitly duplicated the first point as the last point in the polygon's dataset, some will close it automatically without complaining.
- How the polygon is defined is dictated by the data format

Common raster data concepts

Band math
- Multidimensional array math, typically matrix ops from linear algebra
Change detection
- Highlight changes between images taken at different times
Histogram
- Statistical distribution of values in the dataset
- Used for analysis and operations like contrast increases
Feature extraction
- Manual or automatic digitization of features in an image to form vector data
Supervised classification
Unsupervised classification

Creating the simplest possible Python GIS

Chapter 2: Geospatial Data

Overview of common data formats

Typical data sources may be:
- Spreadsheets, CSV, TSV
- Geotagged photos
- Lightweight binary points, lines, polygons
- Multi-GB satellite or aerial rasters
- Elevation data like grids, point clouds, integer based images
- XML
- JSON
- Databases
- Web services
Before 2004 it was hard to get geodata
Then Google Maps and Google Earth came out, Microsoft launced TerraServer, and Open Geospatial Consortium developed Web Map Service to 1.3.0.
Esri also released v9 of ArcGIS server
These all drove things towards common basemaps and Googles web map tiling model
Gmaps was served dynamically with AJAX, and scaled really solidly
Also offered people the ability to mashup data with an API
There are distributed geospatial layers like OpenLayers, that gives an API richer than Gmaps
Complimentary to that is OpenStreetMap, which has global, street-level vector data
Incentives changed, siloed data started opening up some
Analysts benefited:
- Data distribution started being standardized to Mercator
- Google standardized datum on WGS 84, as does GPS

Data structures

Common traits of geodata across formats
- Geolocation
- Subject information

Spatial indexing

Indexing algorithms
- Quadtree index
  - series of different algorithms based on a common theme
  - Each node in the index has four children
  - Child nodes are typically square or rectangular
  - When a node contains a specified number of features, it splits if additional features are added
  - Dividing space into nested squares speeds up spatial search
  - Software only has to handle five points at a time and use gt/lt comparisons to check for point inclusion in a node
  - Most commonly found as file based indices
- R-tree index
  - More sophisticated than quadtrees
  - Handle 3D data
  - Optimized to store the index in a way compatible with how a database uses disk / memory
  - Nearby objects are aggregated into hierarchical nodes that are balanced at each level
  - Bounding boxes of objects may overlap across nodes (unlike quadtree)
  - Due to complexity / db integration, these are typically found in databases, not files
- Grids
  - Spatial indexes may use integer grids
  - Coordinates are typically floating point with a fixed degree of precision
  - Float computations are expensive compared to integers
  - Indexed search is about initial eliminating possibilities that don't require precision (clearly outside the bounds of hte query)
  - Most spatial indexing algorithms therefore map floats to a fixed size integer grid for initial search, then switch to full resolution for the narrowed dataset
  - Grid sizes can be wildly variant

Overviews

Most overview data is raster
They're resampled, lower resolution versions that give thumbnail or preload views at different scales
Also known as pyramids, 'pyramiding' an image is creating one
Usually preprocessed and stored with the full resolution data, embedded or in a separate file
Vector data also has a concept of overviews, typically created on the fly because vectors are scalable
Sometimes vector data is rasterized by converting to a thumbnail image, stored with or embedded in the image header

Metadata

Most data formats contain the footprint or bounding box of the data on the earth
Detailed metadata is typically stored outside the data file
Formats include
- US Federal Geographic Data Committee (FGDC) Content Standard for Digitial Geospatial Metadata (CSDGM)
- ISO
- Newer EU format, Infrastructure for Spatial Information in the European Community (INSPIRE)

File Structure

Human readable formats are easy to poke around in
Binary data you can get at with the struct module or a third party library

Example of parsing the bounding box out of a shapefile:

import struct
bb = {}
with open("hancock.shp", "rb") as f:
    f.seek(36)
    bb['min_lon'] = struct.unpack("<d", f.read(8))
    bb['min_lat'] = struct.unpack("<d", f.read(8))
    bb['max_lon'] = struct.unpack("<d", f.read(8))
    bb['max_lat'] = struct.unpack("<d", f.read(8))

Vector data

OGC has 16+ formats for vector data
Stores only geometric primitives, as associated points
Geospatial vector data stores Earth-based points, not screen based
Typically linked to info about the object being represented
Geospatial vector data typically contains no styling information
Geospatial vectors typically include very primitive geometries for points, lines, and polygons, no curves
You can store vector data in human readable formats like CSV, text, GeoJSON, XML
In the early 90's data was all binary, then switched to XML. File sizes are greater, but they're more portable / genericized.
Open source library OGR has 86+ supported vector formats
Commercial counterpart, Safe Software's Feature Manipulation Engine (FME) has 188+

Shapefiles

Most ubiquitous format, Esri shapefile
Released the spec as an open format in 1998
Basically all GIS software reads it
OGR works on it, as do the modules Shapely and Fiona (based on OGR)
The format consists of multiple files (3 at minimum, up to 15)
Required file types within the standard:
- .shp
  - Purpose: shapefile, contains the geometry
  - Some software will accept only .shp without the .shx or .dbf
- .shx
  - Purpose: shape index file; fixed sized record index referencing the geometry for fast access
  - Meaningless without the .shp
- .dbf
  - Purpose: database file, holds geometry attributes
  - Can be accessed separately sometimes, as the format predates .shp. Openable by spreadsheets.
Optional file types within the standard:
- .sbn
  - Purpose: spatial bin file, the shapefile spatial index
  - Has bounding boxes of features mapped to a 256x256 integer grid
- .sbx
  - Purpose: Fixed sized record index for the .sbn file
  - Ordered record index of a spatial index
- .prj
  - Purpose: Map projection info stored in well known text format
  - May also accompany raster data, needed for on the fly reprojection
- .fbn
  - Purpose: Spatial index of read only features
  - Very rarely used
- .fbx
  - Purpose: Fixed-size record index of the .fbn spatial index
  - Very rarely used
- .ixs
  - Purpose: Geocoding index
  - Common in geocoding applications including driving directions
- .mxs
  - Purpose: Another type of geocoding index
  - Less common than .ixs
- .ain
  - Purpose: Attribute index
  - Mostly legacy format, rarely used now
- .aih
  - Purpose: Attribute index
  - Accompanies .ain files
- .qix
  - Purpose: Quadtree index
  - Spatial index created by open source community because Esri .sbn and .sbx files were undocumented
- .atx
  - Purpose: Attribute index
  - More recent, Esri specific attribute index for fast queries
- .shp.xml
  - Purpse: Metadata
  - Geospatial metadata container, can be any of multiple XML standards
- .cpg
  - Purpose: Code page file for .dbf
  - Used for internationalization of the .dbf files
If you want to rename a shapefile, you must rename all associated files as well
Records in these files are not numbered
Records include geometry, the .shx index record, and the .dbf record
Those are stored in a fixed order
Records are numbered dynamically on opening them but the numbers are not saved
Deleting a record will bump its number next time it is opened
Don't base anything on record number unless you create a new, secondary identifier in the .dbf file and assign each record a number.
If you edit shapefiles, do it with software that manages the file(s) as a shared dataset.

CAD files

CAD formats are mostly from Autodesk via AutoCAD
Two typically seen formats are
- Drawing Exchange Format (DXF)
- Drawing (DWG, autocad native)
Features of these file types:
- Curves
- Surfaces
- 3d solids
- Text rendered as objects
- Text styling
- Viewport configuration
If you encounter CAD data, best to ask if you can get shapefiles

Tag-based and markup-based formats

Typically XML
May also be well-known text (WKT) for .prj
XML formats include
- Keyhole Markup Language (KML)
- OpenStreetMap (OSM)
- Garmin GPX for GPS data
- Open Geospatial Consortium's Geographic Markup Language (GML)
- That's the basis for the OGC Web Feature Service (WFS)
- GML has been largely superseded by KML and GeoJSON
XML files have more than just geometry, may have attributes and rendering instructions
KML is now a fully supported OGC standard
XML is attractive to geospatial analysts because:
- It's human readable
- Can be edited with a text editor
- Well supported by programming languages
- By definition easily extensible
Issues with XML:
- Inefficient for large data storage
- Can be cumbersome to edit
- Errors in datasets are common, most parsers deal with them poorly
SVG (scalable vector graphics) is a widely supported XML format
SVG is not a geographic format
WKT is an older OGC standard

Example WKT for WGS84 coordinate system:

GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.01745329251994238,
        AUTHORITY["EPSG","9122"]],
    AUTHORITY["EPSG","4326"]]

Standards codes can be quite long, EPSG created a numerical coding system to reference projections by shorthand

GeoJSON

Standard way to define geometry, attributes, bounding boxes, and projection information
More compact than XML, though less than binary formats
Key component of REST web APIs

Raster data

Raster datasets are two dimensional arrays
Can be stored as ASCII or BLOB in databases
Geospatial raster resolution is not dots per inch, it's ground distance per cell
May contain multiple bands, meaning that different wavelengths of light can be collected over the same area
Typical range is 3-7 bands, but can be several hundred
Can be viewed individually or swapped in and out as the RGB bands of an image
Can be recombined into a derivative single-band image, and recolored using a set number of classes representing like values
Raster data often shows up in scientific computing, which uses complex formats including Network Common Data Form (NetCDF), GRIB, and HDF5, which store entire data models
Raster data can come in a bunch of formats
Geospatial Data Abstraction Library (GDAL) includes 130+ raster formats

TIFF files

Tagged Image File Format, most common geospatial raster format
Flexible tagging system lets it store any type of data in a single file
Can have overview images, multiple bands, integer elevation data, basic metadta, internal compression, and a variety of other data typically stored in additional supporting files by other formats
Anyone can unofficially extend the format by adding tagged data to the file structure
May mean a "valid" TIFF does not work in some applications, if it has been extended beyond what that application recognizes
GeoTIFF defines how geospatial data is stored
May show up as .tiff, .tif, or gtif

JPEG, GIF, BMP, PNG

Common image formats, can be used for geospatial data
Typically rely on accompanying support files for georeferencing
JPEG is fairly common for geospatial data
JPEGs have a built in metadata tagging system, EXIF

Compressed formats

Geodata rasters tend to be stored using compression because they're real big
Latest open standard is JPEG 2000
That includes wavelet compression and georeferenced data
Multi-resolution Seamless Image Database (MrSID, .sid) and Enhanced Compression Wavelet (ECW, .ecw) are proprietary wavelet compression formats that come up in geospatial contexts
TIFF supports compression including Lempel-Ziv-Welch (LZW)
Compressed data is fine for part of a base map, but should not be used for remote sensing processing

ASCII grids

Common for elevation data
File format created by Esri, but now an unofficial standard, widely supported
Simple text file with x,y values as rows and columns
Spatial info for the raster is in a simple header
Not terribly efficient, but don't require any special libraries either
Often distributed as zip files
Headers contain:
- number of columns
- number of rows
- x-axis cell center coordinate | x-axis lower-left corner coordinate
- y-axis cell center coordinate | y-axis lower-left corner coordinate
- cell size in mapping units
- the no-data value (typically 9999)

World files

Simple text files that provide geospatial referencing info to any image externally for file formats with no native spatial info
Recognized by geospatial software due to naming convention
Most common way to name a world file is to use the raster file name and then alter the extension to remove the middle letter and add w to the end
Examples:
- World.jpg == World.jpw
- World.tif == World.tfw
- World.bmp == World.bpw
- World.png == World.pgw
- World.gif == World.gfw
File structure is simple, it's a six line text file:
- Line 1: Cell size along the x axis in ground units
- Line 2: Rotation on the y axis
- Line 3: Rotation on the x axis
- Line 4: Cell size along the y axis in ground units
- Line 5: Center x-coordinate of the upper left cell
- Line 6: Center y-coordinate of the upper left cell
Rotation is crucial because remote sensing images are often rotated due to teh data collection platform
Great tool when working with raster data in python

Point cloud data

Any data collected as the (x,y,z) location of a surface point based on some sort of focused energy return
You can get it from lasers, radar, acoustic soundings, or other waveform generators
Spacing between points is arbitrary and depends on the type and position of the collecting sensor
Book mostly concerned with LIDAR data and radar data
Radar point cloud data mostly comes from space missions, LIDAR is terrestrial or airborne
LIDAR uses laser range finding to model the world at high precision
LIDAR is a combination of the words light and radar, maybe Light Detection and Ranging
LIDAR sensors are high-speed, continuous, and have a wide field of view (often 360 from the sensor), so the data doesn't tend to have a regular footprint like other raster datasets
LIDAR datasets are point clouds because the data is a stream of 3 tuples, where the z value is the distance from the laser to the ranged object, and the x,y are the projected location of that object calculated from the location of the sensor.
Most common format for LIDAR is LIDAR Exchange Format (LAS)
Can be represented in many ways including a text file with one tuple per line
Can be used to create 2D elevation rasters, can be colorized

Web Services

Most common protocols are Web Map Service (WMS) that returns a rendered map image and Web Feature Service (WFS) that typically returns GML
Many WFS services can also return KML, JSON, zipped shapefiles, and other formats

Chapter 3: The Geospatial Technology Landscape

Most software, open source and commercial, derives from a few key packages
High level core capabilities for geospatial libraries:
- Data access
- Computational geometry (incl. data reprojection)
- Visualization
- Metadata tools
- Image processing for remote sensing (very fragmented category)
Most image processing software is based on the core data access libraries with custom image processing logic layered on top
Common examples of this type of software:
- Open Source Software Image Map (OSSIM)
- Geographic Resources Analysis Support System (GRASS)
- Orfeo ToolBox (OTB)
- ERDAS IMAGINE
- ENVI
Core libraries in use by most other packages:
- GDAL
- OGR
- GEOS
- PROJ.4

Data access

Data access libraries underpin everything else in GIS work
Accuracy and precision are incredibly important
Libraries must manage memory efficiently

GDAL

Geospatial Data Abstraction Library
Gives a single, abstract data model for raster data types
Consolidates data access libraries for different data formats
Gives a common read/write API
Abstraction of the GDAL dataset:
- GDAL Rasterband
  - Raster (0 to n bands)
    - Width in pixels
    - Height in lines
  - Data Type
    - Byte
    - UInt16 / Int16 / UInt32 / Int32
    - Float32 / Float 64
    - CInt16 / CInt32
    - CFloat32 / CFloat64
  - Block size
    - Data chunk read size
- Projection
  - Coordinate system
  - Georeferencing
    - Affine geo-transform
    - Ground control points
- Metadata
  - Arbitrary tagging system
    - Some default tags
    - XML storage ability
- Overviews
  - Freestanding bands
    - Reduced resolution
  - 0-n overview bands

OGR

OGR Simple Features Library is the vector companion to GDAL
At least partial support for 70+ vector formats
Originally stood for OpenGIS Simple Features Reference Implementation
Not actually a reference implementation
Licensed X11/MIT
Capabilities:
- Uniform vector data and modeling abstraction
- Vector data re-projection
- Vector data format conversion
- Attribute data filtering
- Basic geometry filtering including clipping and point-in-polygon testing
Architectural sections / objects in OGR
- Geometry
- Feature definition
- Feature
- Spatial reference
- Layer
- Data source
- Drivers
The Geometry object represents the data model for points, lines, linestrings, polygons, geometrycollections, multipolygons, multipoints, and multilinestrings
Feature object ties Geometry and Feature Definition info together
Spatial Reference contains an OGC Spatial Reference definition. What?
Layer represents features grouped as layers in a data source
Data Source is the file or database object accessed by OGR
Driver contains translators for the multiple underlying data formats
Works smoothly with one quirk: Layer concept is used even for data formats that only contain a single layer. Mild inconvenience.

Computational Geometry

Algorithms for ops on vector data
Most geospatial libraries are separate from graphics libraries because of geospatial coordinate systems
Screen coordinates are almost entirely positive, geo coordinates can be positive or negative across a meridian
Some features of OGR move beyond data access into computational geometry
Geospatial algorithms are pretty hard to implement from scratch.

PROJ.4 projection library

Created by Jerry Evenden at USGS in the 90s
Now a project of the Open Source Geospatial Foundation
Purpose is to transform data betwene thousands of coordinate systems
The math to do so is super complex, and nothing approaches PROJ.4
Uses a simple syntax capable of describing any projection
Used in virtually every major GIS package that does reprojection
Has its own command line tools
Also available through GDAL and OGR
Access it directly to reproject individual points

CGAL

Computational Geometry Algorithms Library
Originally from late 90s
Not specifically for geospatial analysis, but commonly used
Useful for operations like resizing a polygon

JTS

Java Topology Suite
Implements the Open Geospatial Consortium (OGC) Simple Features Specification for SQL
Has been ported to other languages
Has a great test program, JTS Test Builder, that gives you a GUI for testing individual functions without writing wrapper scripts
Lets you interactively test algorithms to verify data or just understand a process
Has not seen development in a long time

GEOS

Geometry Engine - Open Source
C++ port of the JTS library
Much larger impact on geospatial work than JTS
Lots of infrastructure exists for things like python bindings
Most common usage is via APIs
Capabilities:
- OGC Simple Features
- Geospatial predicate functions
- Intersects
- Touches
- Disjoint
- Crosses
- Within
- Contains
- Overlaps
- Equals
- Covers
- Geospatial Operations
- Union
- Distance
- Intersection
- Symmetric Difference
- Convex Hull
- Envelope
- Buffer
- Simplify
- Polygon assembly
- Polygon validation
- Area
- Length
- Spatial Indexing
- OGC well-known text and well-known binary IO
- C and C++ API
- Thread safety
Can be compiled with GDAL to give OGR all its capability

PostGIS

Most common spatial database
Module on top of PostgreSQL
Much of the power comes from GEOS library
Implements OGC Simple Features Specification for SQL
Allows you to execute both attribute and spatial queries against a dataset
Spatial operations are available via SQL functions
Feature set
- Geospatial geometry types including points, linestrings, polygons, multipoints, multilinestrings, multipolygons, and geometry collections
- Spatial functions for testing geometric relationships
- Spatial functions for deriving new geometries
- Spatial measurements including perimeter, length, area
- Spatial indexing using R-Trees
- A basic geospatial raster data type
- Topology data types
- US Geocoder based on TIGER census data
- New JSONB datatype that allows indexing/querying JSON and GeoJSON

Other spatially-enabled databases

PostGIS is the standard, but there are others to be aware of
Oracle Spatial and Graph
- geospatial data schema
- spatial indexing system based on R-Tree
- SQL API for geometric operations
- Spatial data tuning API for optimizing datasets
- Topology data model
- Network data model
- GeoRaster data type
- 3D data types including Triangulated Irregular Networks and LIDAR point clouds
- Geocoder
- Routing engine for driving direction queries
- OGC compliance
ArcSDE, Esri's Spatial Data Engine
- Rolled into ArcGIS Server
- Mostly DB independent, supports multiple DB backends
Microsoft SQLServer
MySQL

SpatiaLite

Extension to SQLite
Spatial data types and indexing are in SQLite
This adds OGC Simple Features Specification compliance and map projections

Routing

Niche area of computational geometry
Main contenders for dealing with it are Esri Network Analyst and pgRouting engine for PostGIS

Files

NotesOn_LearningGeospatialAnalysisWithPython_2ed.md

Latest commit

History

NotesOn_LearningGeospatialAnalysisWithPython_2ed.md

File metadata and controls

Notes on Learning Geospatial Analysis with Python, 2nd Ed.

Chapter 1: Learning Geospatial Analysis with Python

GIS concepts

Remote Sensing Concepts

Common vector GIS concepts

Common raster data concepts

Creating the simplest possible Python GIS

Chapter 2: Geospatial Data

Overview of common data formats

Data structures

Spatial indexing

Overviews

Metadata

File Structure

Vector data

Shapefiles

CAD files

Tag-based and markup-based formats

GeoJSON

Raster data

TIFF files

JPEG, GIF, BMP, PNG

Compressed formats

ASCII grids

World files

Point cloud data

Web Services

Chapter 3: The Geospatial Technology Landscape

Data access

GDAL

OGR

Computational Geometry

PROJ.4 projection library

CGAL

JTS

GEOS

PostGIS

Other spatially-enabled databases

SpatiaLite

Routing

Desktop tools (including visualization)