NOTEBOOK ON SPATIAL DATA ANALYSIS

NOTE: To cite this material use:
SMITH, T.E., (2014) Notebook on Spatial Data Analysis [online] https://www.seas.upenn.edu/~tesmith/NOTEBOOK/index.html

 

INTRODUCTION  

 

I. SPATIAL POINT PATTERN ANALYSIS

 

    1. Examples of Point Patterns
        1.1  Clustering versus Uniformity
        1.2  Comparisons between Point Patterns

    2. Complete Spatial Randomness
 
        2.1  Spatial Laplace Principle
        2.2  Complete Spatial Randomness
        2.3  Poisson Approximation
        2.4  Generalized Spatial Randomness
        2.5  Spatial Stationarity
   
    3. Testing Spatial Randomness

        3.1  Quadrat Method
        3.2  Nearest-Neighbor Methods
               3.2.1  Nearest-Neighbor Distribution under CSR
               3.2.2  Clark-Evans Test
        3.3  Redwood Seedling Example
               3.3.1  Analysis of Redwood Seedlings using JMPIN
               3.3.2  Analysis of Redwood Seedlings using MATLAB
        3.4  Bodmin Tors Example
        3.5  A Direct Monte Carlo Test of CSR

    4.  K-Function Analysis of Point Patterns

       
        4.1  Wolf-Pack Example
        4.2  K-Function  Representations
        4.3  Estimation of K-Functions
        4.4  Testing the CSR Hypothesis
        4.5  Bodmin Tors Example
        4.6  Monte Carlo Testing Procedures
               4.6.1  Simulation Envelopes
               4.6.2  Full P-Value Approach
        4.7  Nonhomogeneous CSR Hypotheses
               4.7.1  Housing Abandonment Example
               4.7.2  Monte Carlo Tests of Hypotheses
               4.7.3  Lung Cancer Example
        4.8  Local K-Function Analysis
               4.8.1  Construction of Local K-Functions
               4.8.2  Local Tests of Homogeneous CSR Hypotheses
               4.8.3  Local Tests of Nonhomogeneous CSR Hypotheses

    5.  Comparative Analyses of Point Patterns

        
         5.1  Forest Example
         5.2  Cross K-Functions
         5.3  Estimation of Cross K-Functions
         5.4  Spatial Independence Hypothesis
         5.5  Random-Shift Approach to Spatial Independence
               5.5.1  Spatial Independence Hypothesis for Random Shifts
               5.5.2  Problem of Edge Effects
               5.5.3  Random Shift Test
               5.5.4  Application to the Forest Example
         5.6  Random-Labeling Approach to Spatial Independence
               5.6.1  Spatial Indistinguishability Hypothesis
               5.6.2  Random Labeling Test
               5.6.3  Application to the Forest Example
         5.7  Analysis of Spatial Similarity
               5.7.1  Spatial Similarity Test
               5.7.2  Application to the Forest Example
         5.8  Larynx and Lung Cancer Example
               5.8.1  Overall Comparison of the Larynx and Lung Cancer Populations
               5.8.2  Local Comparison in the Vicinity of the Incinerator
               5.8.3  Local Cluster Analysis of Larynx Cases

    6.  Space-Time Point Processes

        
         6.1  Space-Time Clustering
         6.2  Space-Time K-Functions
         6.3  Temporal Indistinguishability Hypothesis
         6.4  Random Labeling Test
         6.5  Application to the Lymphoma Example

      APPENDIX TO PART I
 

         A1.1.  Poisson Approximation of the Binomial

         A1.2.  Distributional Properties of Nearest-Neighbor Distances under CSR

         A1.3.  Distribution of Skellam's Statistic under CSR

         A1.4.  Effects of Positively Dependent Nearest-Neighbor Samples

         A1.5.  The Point-in-Polygon Procedure

         A1.6.  A Derivation of Ripley's Correction

         A1.7.  An Alternative Derivation of P-Values for K-Functions

         A1.8.  A Grid Plot Procedure in MATLAB

 

II. CONTINUOUS SPATIAL DATA ANALYSIS


  1. Overview of Spatial Stochastic Processes


        1.1  Standard Notation
        1.2  Basic Modeling Framework

  2. Examples of Continuous Spatial Data


        2.1  Rainfall in the Sudan

        2.2  Spatial Concentration of PCBs

  3. Spatially-Dependent Random Effects


        3.1  Random Effects at a Single Location    

               3.1.1   Standardized Random Variables
               3.1.2   Normal Distribution

               3.1.3   Central Limit Theorems

               3.1.4   CLT for the Sample Mean
        3.2  Multi-Location Random Effects  

               3.2.1   Multivariate Normal Distribution

               3.2.2   Linear Invariance Property

               3.2.3  Multivariate Central Limit Theorem

        3.3  Spatial Stationarity  

               3.3.1   Example: Measuring Ocean Depths

               3.3.2   Covariance Stationarity

               3.3.3   Covariograms and Correlograms

  
  4. Variograms


        4.1  Expected Squared Differences

        4.2  The Standard Model of Spatial Dependence

        4.3  Non-Standard Spatial Dependence   
        4.4  Pure Spatial Dependence

        4.5  The Combined Model   

        4.6  Explicit Models of Variograms   
              4.6.1  The Spherical Model

              4.6.2  The Exponential Model   

              4.6.3  The Wave Model  
        4.7  Fitting Variogram Models to Data    
              4.7.1  Empirical Variograms

              4.7.2  Least-Squares Fitting Procedure

        4.8  The Constant-Mean Model

        4.9  Example: Nickel Deposits on Vancouver Island
              4.9.1  Empirical Variogram Estimation

              4.9.2  Fitting a Spherical Variogram

        4.10  Variograms versus Covariograms
               4.10.1  Biasedness of the Standard Covariance Estimator

               4.10.2  Unbiasedness of Empirical Variogram for Exact-Distance Samples

               4.10.3  Approximate Unbiasedness of General Empirical Variograms

  
  5. Spatial Interpolation Models


        5.1  A Simple Example of Spatial Interpolation

        5.2  Kernel Smoothing Models

        5.3  Local Polynomial Models

        5.4  Radial Basis Function Models

        5.5  Spline Models

        5.6  A Comparison of Models using the Nickel Data

  6. Simple Spatial Prediction Models


        6.1  An Overview of Kriging Models

                   6.1.1  Best Linear Unbiased Predictors

                   6.1.2  Model Comparisons

          6.2  The Simple Kriging Model

                   6.2.1  Simple Kriging with One Predictor

                   6.2.2  Simple Kriging with Many Predictors

                   6.2.3  Interpretation of Prediction Weights

                   6.2.4  Construction of Prediction Intervals

                   6.2.5  Implementation of Simple Kriging Models

                   6.2.6  An Example of Simple Kriging

        6.3  The Ordinary Kriging Model

                   6.3.1  Best Linear Unbiased Estimation of the Mean

                    6.3.2  Best Linear Unbiased Predictor of Y

                    6.3.3  Implementation of Ordinary Kriging

                    6.3.4  An Example of Ordinary Kriging

        6.4  Selection of Prediction Sets by Cross Validation

                   6.4.1  Log-Nickel Example

                    6.4.2  A Simulated Example

  7. General Spatial Prediction Models


        7.1  The General Linear Regression Models

                   7.1.1  Generalized Least Squares Estimation

                   7.1.2  Best Linear Unbiasedness Property

                   7.1.3  Regression Consequences of Spatially Dependent Random Effects

        7.2  The Universal Kriging Model

                   7.2.1  Best Linear Unbiased Prediction

                   7.2.2  Standard Error of Predictions

                   7.2.3  Implementation of Universal Kriging

        7.3  Geostatistical Regression and Kriging

                   7.3.1  Iterative Estimation Procedure

                   7.3.2  Implementation of Geo-Regression

                   7.3.3  Implementation of Geo-Kriging

                   7.3.4  Cobalt Example of Geo-Regression

                   7.3.5  Venice Example of Geo-Regression and Geo-Kriging

APPENDIX TO PART II

         A2.1.  Covariograms for Sums of Independent Spatial Processes

           A2.2.  Expectation of the Sample Estimator under Sample Dependence

           A2.3.  A Bound on the Binning Bias of Empirical Variogram Estimators

           A2.4.   Some Basic Vector Geometry

           A2.5.  Differentiation of Functions

           A2.6.  Gradient Vectors

           A2.7.  Unconstrained Optimization of Smooth Functions

                      7.1  First-Order Conditions

                      7.2  Second-Order Conditions

                      7.3  Application to Ordinary Least Squares Estimation

           A2.8.  Constrained Optimization of Smooth Functions

                      8.1  Minimization with a Single Constraint

                      8.2  Minimization with Multiple Constraints

                      8.3  Solution for Universal Kriging

 

III. AREAL DATA ANALYSIS

 

1. Overview of Areal Data Analysis


        1.1  Extensive versus Intensive Data Representations
        1.2  Spatial Pattern Analysis

        1.3  Spatial Regression Analysis

2. Modeling the Spatial Structure of Areal Units


        2.1  Spatial Weights Matrices

               2.1.1  Point Representations of Areal Units
               2.1.2  Spatial Weights based on Centroid Distances

               2.1.3  Spatial Weights based on Boundaries

               2.1.4  Combined Distance-Boundary Weights

               2.1.5  Normalizations of Spatial Weights

        2.2  Construction of Spatial Weights Matrices

               2.2.1  Construction of Spatial Weights based on Centroid Distances
               2.2.2  Construction of Spatial Weights based on Boundaries

3. The Spatial Autoregressive Model


        3.1  Relation to Time Series Analysis

        3.2  The Simultaneity Property of Spatial Dependencies

        3.3  A Spatial Interpretation of Autoregressive Residuals        

               3.3.1  Eigenvalues and Eigenvectors of Spatial Weights Matrices
               3.3.2  Convergence Conditions in Terms of Rho

               3.3.3  A Steady-State Interpretation of Spatial Autoregressive Residuals

4. Testing for Spatial Autocorrelation


        4.1  Three Test Statistics

               4.1.1  Rho Statistic

               4.1.2  Correlation Statistic

               4.1.3  Moran Statistic

               4.1.4  Comparison of Statistics

        4.2  Asymptotic Moran Tests of Spatial Autocorrelation

                 4.2.1  Asymptotic Moran Test for Regression Residuals

               4.2.2  Asymptotic Moran Test in ARCMAP

        4.3  Random Permutation Test of Spatial Autocorrelation

               4.3.1  SAC-Perm Test

               4.3.2  Application to English Mortality Data

5. Tests of Spatial Concentration


        5.1  A Probabilistic Interpretation of G*

        5.2  Global Tests of Spatial Concentration

        5.3  Local Tests of Spatial Concentration

               5.3.1  Random Permutation Test

               5.3.2  English Mortality Example

               5.3.3  Asymptotic G* Test in ARCMAP

               5.3.4  Advantage of G* over G for Analyzing Spatial Concentration

6. Spatial Regression Models for Areal Data Analysis


        6.1  The Spatial Errors Model (SEM)

        6.2  The Spatial Lag Model (SLM)

               6.2.1  Simultaneity Structure

               6.2.2  Interpretation of Beta Coefficients

        6.3  Other Spatial Regression Models

               6.3.1  The Combined Model

               6.3.2  The Durbin Model

               6.3.3  The Conditional Autoregressive (CAR) Model

7. Spatial Regression Parameter Estimation


        7.1  The Method of Maximum-Likelihood Estimation

        7.2  Maximum-Likelihood Estimation for General Linear Regression Models

               7.2.1 Maximum-Likelihood Estimation for OLS

               7.2.2 Maximum-Likelihood Estimation for GLS

        7.3  Maximum-Likelihood Estimation for SEM

        7.4  Maximum-Likelihood Estimation for SLM

        7.5  An Application to the Irish Blood Group Data

               7.5.1 OLS Residual Analysis and Choice of Spatial Weights Matrices

               7.5.2 Spatial Regression Analyses

8. Parameter Significance Tests for Spatial Regression


        8.1  A Basic Example of Maximum Likelihood Estimation and Inference

               8.1.1 Sampling Distribution by Elementary Methods

               8.1.2 Sampling Distribution by General Maximum-Likelihood Methods  

        8.2  Sampling Distributions for General Linear Models with Known Covariance

               8.2.1 Sampling Distribution by Elementary Methods

               8.2.2 Sampling Distribution by General Maximum-Likelihood Methods  

        8.3  Asymptotic Sampling Distributions for the General Case

        8.4  Parameter Significance Tests for SEM

               8.4.1 Parametric Tests for SEM

               8.4.2 Application to the Irish Blood Group Data

        8.5  Parameter Significance Tests for SLM

               8.5.1 Parametric Tests for SLM

               8.5.2 Application to the Irish Blood Group Data

9. Goodness-of-Fit Measures for Spatial Regression


        9.1  The R-Squared Measure for OLS

               9.1.1 The Regression Dual

               9.1.2 Decomposition of Total Variation

               9.1.3 Adjusted R-Squared

        9.2  Extended R-Squared Measures for GLS

               9.2.1 Extended R-Squared for SEM

               9.2.2 Extended R-Squared for SLM

        9.3  The Squared Correlation Measure for GLS Models

               9.3.1 Squared Correlation for OLS

               9.3.2 Squared Correlation for SEM and SLM

               9.3.3 A Geometric View of Squared Correlation

10. Comparative Tests among Spatial Regression Models


        10.1  A One-Parameter Example

        10.2  Likelihood-Ratio Tests against OLS

        10.3  The Common-Factor Hypothesis              

        10.4  The Combined-Model Approach

 

APPENDIX TO PART III

         A3.1.  The Geometry of Linear Transformations

                   3.1.1 Nonsingular Transformations and Inverses

                   3.1.2 Orthonormal Transformations

           A3.2.  Singular Value Decomposition Theorem

                   3.2.1 Inverses and Pseudoinverses

                   3.2.2 Determinants and Volumes

                   3.2.3 Linear Transformations of Random Vectors

           A3.3.  Eigenvalues and Eigenvectors

           A3.4.  Spectral Decomposition Theorem

                   3.4.1 Eigenvalues and Eigenvectors of Symmetric Matrices

                   3.4.2 Some Consequences of SVD for Symmetric Matrices

                   3.4.3 Spectral Decomposition of Symmetric Positive Semidefinite Matrices

                   3.4.4 Spectral Decompositions with Distinct Eigenvalues

                   3.4.5 General Spectral Decomposition Theorem

        A3.5.  Nonnegative Matrices

                   3.5.1 Strongly Connected Matrices

                   3.5.2 Perron-Frobenius Theorem

                   3.5.3 Application to Spatial Autoregressive Kernels

                   3.5.4 Geometry of Complex Eigenvalues

        A3.6.  Geometry of Correlation in Regression

                   3.6.1 Deviation Space

                   3.6.2 Regression in Deviation Space

                   3.6.3 Application to Squared Correlation for OLS and GLS

        A3.7.  Large Sample Properties of Maximum Likelihood Estimators

                   3.7.1 Some Useful Preliminary Results

                   3.7.2 Consistency of Maximum Likelihood Estimators

                   3.7.3 Asymptotic Normality of Maximum Likelihood Estimators

   

IV. SOFTWARE  

     1.  ARCMAP

       1.1   Opening ARCMAP

       1.2  Tips for Using ARCMAP

              1.2.1    Importing Text Files to ARCMAP
              1.2.2    Changing Path Directories in Map Documents
              1.2.3    Making a Column of Row Numbers in an Attribute Table
              1.2.4    Masking in ARCMAP
              1.2.5    Making Spline Contours in Spatial Analyst
              1.2.6    Excluding Values from Map Displays
              1.2.7    Importing ARCMAP Images to the Web
              1.2.8    Adding Areas to Map Polygons
              1.2.9    Adding Centroids to Map Polygons

              1.2.10  Adding Coordinate Fields to Attributes of Point Shapefiles
              1.2.11  Converting Strings to Numbers in ARCMAP
              1.2.12  Displaying Proper Distance Units
              1.2.13  Editing Point Styles in ARCMAP
              1.2.14  Exporting Maps from ARCMAP to WORD
              1.2.15  Making Legends for Exported Maps
              1.2.16  Making Voronoi Tessellations in ARCMAP as Shapefiles
              1.2.17  Running Local G* Tests of Concentration in ARCMAP
              1.2.18  Joining Point Data to Polygon Shapefiles in ARCMAP
              1.2.19  Saving Map Documents with Relative Paths
              1.2.20  Increasing Unique Values for Editing Raster Outputs (in Version 9.3)

    2. JMP

       2.1  Opening JMP

       2.2  Tips for using JMP

              2.2.1   Printing Results from JMP
              2.2.2   Making a Random Reordering of Row Numbers  

    3. MATLAB

       3.1  Opening MATLAB

       3.2  Tips for using MATLAB

              3.2.1   Exporting Graphics from MATLAB to WORD
              3.2.2   Making Boundary-Share Weight Matrices in MATLAB
              3.2.3   Making Boundary-Share Weight Matrices using ARCMAP and MATLAB
              3.2.4   Clipping Grids in ARCMAP for use in MATLAB

              3.2.5   Exporting Data from MATLAB to ARCMAP

              3.2.6   Converting Boundary Shapefiles to MATLAB format
                      

  REFERENCES


 


Last modified: July 18, 2021