Permutation tests

I answered a question about spatial pattern testing on the R-sig-geo mailing list. Designing your own permutation test is (can be?) fun and easy, so I thought I’d save my answer here for future reference.

Why permutation tests? You can get at exactly what you want to know, which can be especially useful with complex spatial questions.

The question: which are closer on average to the points in A: the points in B or the points in C?

Null hypothesis: mean distance from B to A is equal to mean distance from C to A.

I’ve adapted the example slightly from the original question on the mailing list.

library(spatstat)

set.seed(2019)

A <- rpoispp(100) ## Base set
B <- rpoispp(50) ## First set of points
C <- rpoispp(50) ## Second set of points

ABd <- crossdist(A, B)
ACd <- crossdist(A, C)

mean(ABd)
# 0.5168865
mean(ACd)
# 0.5070118

Are the two mean distances different?

One way to approach this with a permutation test is to keep the points in B and C where they are, but shuffle the assignment to B or C. That formulation might be appropriate if you're working with trees, where the trees are where they are but you're concerned with whether species B is closer to species A than another species is. (There are multiple ways to approach this question, but I like permutation tests.)

The basic procedure, then, is to reshuffle the point labels many times and compute the mean distances for each reshuffling. Then compare the true difference in mean distances to the permuted differences to see where the true difference falls in the overall distribution.

This method assumes that the trees don't move, and so takes into account the intrinsic structure of the trees' spatial pattern.
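To make "shuffle the labels, keep the locations" concrete, here is a toy sketch in base R. The five 1-D locations in `x` and the names `is_B` and `shuffled` are made up for illustration and are not part of the original answer.

```r
# Toy illustration: five fixed 1-D locations, the first three labelled B.
x <- c(0.1, 0.4, 0.5, 0.7, 0.9)
is_B <- c(TRUE, TRUE, TRUE, FALSE, FALSE)

set.seed(1)
shuffled <- sample(is_B)   # same labels, random order

x[shuffled]                # locations now treated as B
x[!shuffled]               # locations now treated as C

sum(shuffled)              # still 3: group sizes are preserved
```

The locations never change; only which group each one is counted in does.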

nperm <- 999
permout <- data.frame(ABd = rep(NA, nperm), ACd = rep(NA, nperm))

# create framework for a random assignment of B and C to the existing points

BC <- superimpose(B, C)
B.len <- npoints(B)
C.len <- npoints(C)
B.sampvect <- c(rep(TRUE, B.len), rep(FALSE, C.len))

set.seed(2019)
for(i in seq_len(nperm)) {
  B.sampvect <- sample(B.sampvect)
  B.perm <- BC[B.sampvect]
  C.perm <- BC[!B.sampvect]

  permout[i, ] <- c(mean(crossdist(A, B.perm)), mean(crossdist(A, C.perm)))
}

boxplot(permout$ABd - permout$ACd)
points(1, mean(ABd) - mean(ACd), col="red")

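# How many permuted differences are at least as extreme as the observed one?
table(abs(permout$ABd - permout$ACd) >= abs(mean(ABd) - mean(ACd)))
# FALSE TRUE
# 426 573

# Two-sided permutation p-value
sum(abs(permout$ABd - permout$ACd) >= abs(mean(ABd) - mean(ACd))) / nperm
# 0.5735736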
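A common convention for Monte Carlo p-values counts the observed statistic as one of the permutations, so the p-value can never be exactly zero. A minimal sketch of that convention follows; `perm_pvalue` is a hypothetical helper name, not a spatstat function.

```r
# Two-sided Monte Carlo p-value with the "+1" correction.
# obs:  the observed statistic (here, a difference in mean distances)
# perm: vector of the same statistic computed on permuted labels
perm_pvalue <- function(obs, perm) {
  perm <- perm[!is.na(perm)]
  (sum(abs(perm) >= abs(obs)) + 1) / (length(perm) + 1)
}

# Two of the four permuted values are at least as extreme as 0.5
perm_pvalue(0.5, c(-1, 0.2, 0.7, -0.1))
# 0.6
```

With the objects above, the call would be `perm_pvalue(mean(ABd) - mean(ACd), permout$ABd - permout$ACd)`.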

The difference between ACd and ABd is indistinguishable from that obtained by a random resampling of B and C.

Or, there is no apparent difference in the mean distance of B to A or C to A.

Which is reassuring, since A, B, and C were all randomly generated.
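It's also worth checking that the test can find a difference when one exists. Here is a sketch of the same label-shuffling idea in base R only, with one set deliberately packed into the centre of the unit square. The function names (`mean_crossdist`, `label_perm_test`) and the simulated point sets are my own for this illustration, and plain coordinate matrices stand in for spatstat point patterns.

```r
# Mean of all pairwise Euclidean distances between two point sets,
# each given as a two-column matrix of x and y coordinates.
mean_crossdist <- function(P, Q) {
  dx <- outer(P[, 1], Q[, 1], "-")
  dy <- outer(P[, 2], Q[, 2], "-")
  mean(sqrt(dx^2 + dy^2))
}

# Label-shuffling test: the pooled B and C locations stay fixed,
# only the B/C labels are permuted.
label_perm_test <- function(A, B, C, nperm = 999) {
  BC <- rbind(B, C)
  is_B <- c(rep(TRUE, nrow(B)), rep(FALSE, nrow(C)))
  obs <- mean_crossdist(A, B) - mean_crossdist(A, C)
  perm <- replicate(nperm, {
    s <- sample(is_B)
    mean_crossdist(A, BC[s, , drop = FALSE]) -
      mean_crossdist(A, BC[!s, , drop = FALSE])
  })
  list(observed = obs,
       p = (sum(abs(perm) >= abs(obs)) + 1) / (nperm + 1))
}

set.seed(2019)
base_pts   <- cbind(runif(100), runif(100))                    # base set
spread_pts <- cbind(runif(50), runif(50))                      # whole square
centre_pts <- cbind(runif(50, 0.4, 0.6), runif(50, 0.4, 0.6))  # centre only

res <- label_perm_test(base_pts, spread_pts, centre_pts)
res$observed   # positive: the spread-out set is farther away on average
res$p          # should be small, since the centre set really is closer
```

Points near the centre of the square are closer on average to a uniform pattern than points spread over the whole square, so this time the observed difference should sit far out in the tail of the permuted differences.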

Stuck on rgdal

Update: Thanks to Colin Rundel figuring out the problem and Roger Bivand implementing the fix, the devel version of rgdal (revision 758) on R-Forge installs beautifully.


I updated all of my Linux computers to Fedora 28, and now can’t install rgdal.

I can install sf with no problems, so it isn’t an issue with GDAL or with proj.

I checked with Roger Bivand, the package maintainer, who asked me to confirm whether I could install sf, to make sure it wasn’t a dependency issue, and to post the logs online, then ask on the R-sig-geo mailing list. I’m putting the full problem here, and the abbreviated version on the mailing list.

I’d appreciate any thoughts: I rely very heavily on the incredibly useful rgdal package.


Fedora packages installed (fc28):

gdal 2.2.4-2
proj 4.9.6-3
gcc 8.1.1-1


R information:


> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Fedora 28 (Workstation Edition)

Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.5.0 tools_3.5.0


The problem appears to be with the C++ configuration options, but is beyond my ability to figure out.
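One thing that may help narrow it down is asking R which compiler and flags it uses when it builds packages, and comparing them with what configure reports in config.log further down (there, CXXFLAGS is empty while LDFLAGS carries the Red Hat hardening options). A rough sketch: `r_build_config` is a hypothetical helper name, and it only wraps the standard `R CMD config` command.

```r
# Ask R for the compiler settings it hands to packages at build time.
# Returns a named character vector, one entry per requested variable.
r_build_config <- function(vars = c("CXX", "CXXFLAGS", "CXXPICFLAGS", "LDFLAGS")) {
  r_bin <- file.path(R.home("bin"), "R")
  vapply(vars, function(v) {
    out <- suppressWarnings(
      system2(r_bin, c("CMD", "config", v), stdout = TRUE, stderr = TRUE)
    )
    paste(out, collapse = " ")
  }, character(1))
}

r_build_config()
```

If the values R reports differ from the ones in config.log, that would suggest the package's configure script is not picking up R's settings. That is only a guess at where to look, not a diagnosis.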

The install message is:

* installing *source* package ‘rgdal’ ...
** package ‘rgdal’ successfully unpacked and MD5 sums checked
configure: CC: gcc -m64
configure: CXX: g++ -m64
configure: rgdal: 1.3-2
checking for /usr/bin/svnversion... no
configure: svn revision: 755
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... configure: error: in `/tmp/RtmpfMGBY5/R.INSTALL66c3b8deddb/rgdal':
configure: error: cannot run C++ compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details
ERROR: configuration failed for package ‘rgdal’
* removing ‘/usr/lib64/R/library/rgdal’
* restoring previous ‘/usr/lib64/R/library/rgdal’


And here's config.log:

This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.

It was created by rgdal configure 1.3-2, which was
generated by GNU Autoconf 2.69. Invocation command line was

$ ./configure

## --------- ##
## Platform. ##
## --------- ##

hostname = scgwork
uname -m = x86_64
uname -r = 4.16.12-300.fc28.x86_64
uname -s = Linux
uname -v = #1 SMP Fri May 25 21:13:28 UTC 2018

/usr/bin/uname -p = x86_64
/bin/uname -X = unknown

/bin/arch = x86_64
/usr/bin/arch -k = unknown
/usr/convex/getsysinfo = unknown
/usr/bin/hostinfo = unknown
/bin/machine = unknown
/usr/bin/oslevel = unknown
/bin/universe = unknown

PATH: /usr/lib64/qt-3.3/bin
PATH: /usr/share/Modules/bin
PATH: /usr/local/bin
PATH: /usr/local/sbin
PATH: /usr/bin
PATH: /usr/sbin
PATH: /home/sarahg/.bin

## ----------- ##
## Core tests. ##
## ----------- ##

configure:1773: CC: gcc -m64
configure:1775: CXX: g++ -m64
configure:1778: rgdal: 1.3-2
configure:1781: checking for /usr/bin/svnversion
configure:1794: result: yes
configure:1809: svn revision: 755
configure:1988: checking for C++ compiler version
configure:1997: g++ -m64 --version >&5
g++ (GCC) 8.1.1 20180502 (Red Hat 8.1.1-1)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

configure:2008: $? = 0
configure:1997: g++ -m64 -v >&5
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/8/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap
--enable-languages=c,c++,fortran,objc,obj-c++,ada,go,lto --prefix=/usr
--mandir=/usr/share/man --infodir=/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared
--enable-threads=posix --enable-checking=release --enable-multilib
--with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-linker-build-id --with-gcc-major-version-only
--with-linker-hash-style=gnu --enable-plugin --enable-initfini-array
--with-isl --enable-libmpx --enable-offload-targets=nvptx-none
--without-cuda-driver --enable-gnu-indirect-function --enable-cet
--with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 8.1.1 20180502 (Red Hat 8.1.1-1) (GCC)
configure:2008: $? = 0
configure:1997: g++ -m64 -V >&5
g++: error: unrecognized command line option '-V'
g++: fatal error: no input files
compilation terminated.
configure:2008: $? = 1
configure:1997: g++ -m64 -qversion >&5
g++: error: unrecognized command line option '-qversion'; did you mean
'--version'?
g++: fatal error: no input files
compilation terminated.
configure:2008: $? = 1
configure:2028: checking whether the C++ compiler works
configure:2050: g++ -m64 -I/usr/local/include -Wl,-z,relro -Wl,-z,now
-specs=/usr/lib/rpm/redhat/redhat-hardened-ld conftest.cpp >&5
configure:2054: $? = 0
configure:2102: result: yes
configure:2105: checking for C++ compiler default output file name
configure:2107: result: a.out
configure:2113: checking for suffix of executables
configure:2120: g++ -m64 -o conftest -I/usr/local/include
-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld
conftest.cpp >&5
configure:2124: $? = 0
configure:2146: result:
configure:2168: checking whether we are cross compiling
configure:2176: g++ -m64 -o conftest -I/usr/local/include
-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld
conftest.cpp >&5
/usr/bin/ld: /tmp/cc9pfZ1b.o: relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
configure:2180: $? = 1
configure:2187: ./conftest
./configure: line 2189: ./conftest: No such file or directory
configure:2191: $? = 127
configure:2198: error: in `/home/sarahg/Downloads/rgdal':
configure:2200: error: cannot run C++ compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details

## ---------------- ##
## Cache variables. ##
## ---------------- ##

ac_cv_env_CCC_set=
ac_cv_env_CCC_value=
ac_cv_env_CPPFLAGS_set=
ac_cv_env_CPPFLAGS_value=
ac_cv_env_CXXFLAGS_set=
ac_cv_env_CXXFLAGS_value=
ac_cv_env_CXX_set=
ac_cv_env_CXX_value=
ac_cv_env_LDFLAGS_set=
ac_cv_env_LDFLAGS_value=
ac_cv_env_LIBS_set=
ac_cv_env_LIBS_value=
ac_cv_env_build_alias_set=
ac_cv_env_build_alias_value=
ac_cv_env_host_alias_set=
ac_cv_env_host_alias_value=
ac_cv_env_target_alias_set=
ac_cv_env_target_alias_value=
ac_cv_file__usr_bin_svnversion=yes

## ----------------- ##
## Output variables. ##
## ----------------- ##

CPPFLAGS='-I/usr/local/include'
CXX='g++ -m64'
CXXFLAGS=''
DEFS=''
ECHO_C=''
ECHO_N='-n'
ECHO_T=''
EXEEXT=''
GDAL_CONFIG=''
HAVE_CXX11=''
LDFLAGS='-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld'
LIBOBJS=''
LIBS=''
LTLIBOBJS=''
OBJEXT=''
PACKAGE_BUGREPORT='Roger.Bivand@nhh.no'
PACKAGE_NAME='rgdal'
PACKAGE_STRING='rgdal 1.3-2'
PACKAGE_TARNAME='rgdal'
PACKAGE_URL=''
PACKAGE_VERSION='1.3-2'
PATH_SEPARATOR=':'
PKG_CPPFLAGS=''
PKG_LIBS=''
SHELL='/bin/sh'
ac_ct_CXX=''
bindir='${exec_prefix}/bin'
build_alias=''
datadir='${datarootdir}'
datarootdir='${prefix}/share'
docdir='${datarootdir}/doc/${PACKAGE_TARNAME}'
dvidir='${docdir}'
exec_prefix='NONE'
host_alias=''
htmldir='${docdir}'
includedir='${prefix}/include'
infodir='${datarootdir}/info'
libdir='${exec_prefix}/lib'
libexecdir='${exec_prefix}/libexec'
localedir='${datarootdir}/locale'
localstatedir='${prefix}/var'
mandir='${datarootdir}/man'
oldincludedir='/usr/include'
pdfdir='${docdir}'
prefix='NONE'
program_transform_name='s,x,x,'
psdir='${docdir}'
sbindir='${exec_prefix}/sbin'
sharedstatedir='${prefix}/com'
sysconfdir='${prefix}/etc'
target_alias=''

## ----------- ##
## confdefs.h. ##
## ----------- ##

/* confdefs.h */
#define PACKAGE_NAME "rgdal"
#define PACKAGE_TARNAME "rgdal"
#define PACKAGE_VERSION "1.3-2"
#define PACKAGE_STRING "rgdal 1.3-2"
#define PACKAGE_BUGREPORT "Roger.Bivand@nhh.no"
#define PACKAGE_URL ""

configure: exit 1