telemetR: Extending with PostGIS- 11 mins
The second post in the telemetR series, extending the database with PostGIS .
In the last post we laid the foundation for an animal telemetry data management system. In this post we will extend the functionality of the PostgreSQL using the spatial extension PostGIS. PostGIS is an extenstion to the PostgreSQL that allows GIS objects to be stored in the database. We will be able to us PostGIS to store locations as geographic or geometric coordinates and perform many geographic calculations, that would other wise be implemented in R or a GIS software, in postgres.
- Install PostGIS
- Exploring Spatial Functions
- Visualize Spatial Data
- Use PostGIS in the Data Flow
- QA/QC with Spatial Attributes
- Wrap Up
- telemetR Series
On a Mac the easiest way to install PostGIS is to use the Postgres.app application. This comes packaged with PostGIS. I installed PostgeSQL and PostGIS with homebrew. The nice thing about homebrew is that it’ll install or update all the other dependancies as well. During my first attempt I needed to update my version of postgres to 9.6.1. Once each of them is installed create the PostGIS extension in the database you’re working on.
Exploring Spatial Functions
Before applying spatial utilities to our current data, lets explore some of the spatial functions. All SQL function that you should enter will be prefaced with
=#, anything that is returned by the database will have no leading characters.
This point doesn’t look much like a point. We can change how the point is represented.
ST_AsEWKT returns the Well-Known Text representation of the geometry.
We can set the coordinate reference system (CRS) to the point with
If we don’t want the CRS metadata displayed use
To project the point in a different reference system:
GPS to Spatial Objects
With these basic function from PostGIS we can start using these tools in our telemetry database. First we will create a new column,
geom to store the spatial objects, then add data to that column.
Check the schema of the
telemetry table. There is new column called
geom that is of type
geometry(Point, 4326). Run the SELECT query to see the data. Note that
ST_AsEWKT is required in order to view the actual points.
Now we can play with the data a little bit. First, we will check how many points each animal has in the
telemtry table. I’ve also counted the number in the
geom field to see how many blank records there are. In this case there are zero. It is possible to have records that don’t have coordinates. This can happen when the GPS device is supposed to send a record but doesn’t get a fix, or the point is an error and the coordinates have been deleted. There can be many records with no points depending on the terrain and canopy cover.
In the second SELECT statement we calculate the centroid of the points with
ST_Centroid. The centroid is geometric center of all the points and is calculated by taking the mean of longitude and latitude.<p>
Visualize Spatial Data
There are several methods for visualizing spatial data. You can use a dedicated GIS system like QGIS or ESRI products (open and closed source respectively), R, a web service like geojson.io, or even GitHub. About a 1.5 years ago GitHub gists introduced mapping capabilities when uploading a GeoJSON. The command below will create a GeoJSON object of all the points for each individual. The GeoJSON portion can be copied (
geo_json field) and pasted into a gist.
This is the GeoJSON object for
animal_id, which produces this map.
Pretty cool. We are definitely making progress with the animal movement database.
It is important to remember that the data we receive from the satellite is a set of spatial-temporal points. From this set of data we can reconstruct a very rough estimate of the animal’s actual movement path, the trajectory. We never get a complete record of the animals movement. The resolution of the path depends on the intervals on the collar as well as the biology of the animal.
In the last step we displayed all the collected points of the animal. In this step we will create a trajectory and display that trajectory on a map.
The result is a very poorly scaled map. These two animals occur on opposite sides of Nevada. In order to see their trajectories zoom into Lake Tahoe on the West side of Nevada, it is just West of the Minden map marker. The second trajector is on the East side of the state North of White Rock Range Wildlerness Northeast of the Pioche map marker. (sorry for the choice of animals).
A home range is defined as the area that an individual spends a majority of its time searching for resources (food, water, mates, etc.). The utilization distribution is a probability density function that predicts the probability of finding an animal in a given area. The home range and utilization distribution generally associated with one another in movement ecology. There are several methods to estimate the home range, for now we will estimate the minimum convex polygon [^1] from each animal in the database. In a future post we will use R to estimate home ranges with other methods.
Each animal’s MCP home range estimation.
Use PostGIS in the Data Flow
Recall from the last post that we automated the flow of GPS data from
raw_gps to the
telemetry table with triggers. We can write triggers automate the creation of spatial objects in the database and help with data QA/QC.
Instead of creating the
geom field by hand every time we upload new animals lets create a trigger that automates this process.
Now lets test it out by deleting one of the animals already in the
telemetry table then upload the data for that animal again. This will cause the trigger to insert data from the
raw_gps table into the
telemetry and the trigger to insert
telemetry.latitude into the
telemetry.geom as a point.
QA/QC with Spatial Attributes
GPS telemetry data can have a number of different errors for a number of different. Sometimes the GPS sensor records an erroneous point, or no point at all. The data can be incomplete (missing locations, but no timestamp) or completely missing (was scheduled to get a point but didn’t). This can depend on the biology of the animal, terrain, and canopy cover.
Now that we have extended the database with PostGIS we can use the functionality to detect outliers and erroneous points. Often these points are biologically impossible distances of velocities between two
In this post we extended the telemetry database with PostGIS. The database has most of the major functionality we need. The new GPS data is entered into the
raw_gps table, then automatically inserted into the
telemetry table. The spatial column
telemtry.geom is automatically when new records are added. There are many spatial functionalities that can be added to the database. We will get to that in a later post.
In the next post we will connect to the database with R for processing raw data files from the vendors, creating and inserting records into the datbase, and some basic movement analysis.
- Creating an Animal Movement Database
- Extending the Database with PostGIS
- Importing New Data
- QA/QC with Spatial Attributes
- Connecting to the Database with R
- Adding More SQL Functionality
- Shiny Web Application
- A Simple RESTful API
- … more tbd