Spatial Data Science with PostgreSQL/PostGIS
A practical guide to getting started with PostgreSQL and Python for Spatial Data.
Spatial is not special! If you ever had tried using spatial data with plain SQL, dealing geometries might not be straightforward. However, with Spatial SQL, it feels like dealing just another column in your table.
In this guide, we will go through how to set up and getting started with Spatial SQL. We will install g PostgreSQL, activate PostGIS extension and perform some simple Spatial SQL operations. Later in the article, we will learn how to incorporate Spatial SQL into your Python workflows.
Spatial SQL (PostGIS)
Spatial SQL is processing, leveraging and performing spatial operations on spatial data. A Spatial Database is optimized for storing, manipulating and querying geographic data with defined geometries.
Although there are multiple spatial database providers out there, I usually prefer PostgresSQL/PostGIS. PostGIS is also widely used standing in the Geospatial database world — For example, Big Querry adopts it.
If you are new to the world of Database, you might ask the difference between PostgreSQL and PostGIS. We can think of them this way: Postgresql is the DatabaseDatabase while PostGIS is an extension that provides spatial functionality on top of PostgreSQL.
Installing PostgreSQL + PostGIS
Installing PostgreSQL is straightforward. You need to go ahead and select the Binary package for your distribution: Windows, macOS or Linux.
The core of the PostgreSQL object-relational database management system is available in several source and binary…www.postgresql.org
Once you download Download the Interactive installer by EDB, run the .exe file or follow the instructions given for your distribution. I have included (GIF instructions) here how to add PostGIS extension to your PostgreSQL. We are also installing PgAdmin4 as GUI interface for PostgreSQL.
Make sure you check the PgAdmin4 and Stackbuilder while following the interactive installation guide. This step will allow us later to use the PG Admin 4 User interface. Make sure also to check the PostGIS extension, which enables us to perform spatial operations.
There is a guide for the Interactive installer if you want step by step guide for your operating system.
The PostgreSQL installers created by EnterpriseDB are designed to make it quick and simple to install PostgreSQL on…www.enterprisedb.com
PgAdmin 4 GUI Interface
Let us get familiar with PgAdmin GUI interface which we will use in this tutorial. The following GIF shows the interface. We are also creating a PostGIS extension, which brings forth the spatial functions we will use later.
We can simply create a PostGIS extension with the following command implementing in PgAdmin Query tool.
CREATE EXTENSION postgis;
After running this command, we have plenty of new spatial functions available to our disposal.
Let us also create a database using PgAdmin4. The following GIF instructions show you how to create a new database.
Loading Geospatial data is different than loading CSV files into PostgreSQL. We need to preserve the Geometry column. There is, however, an easy way to upload spatial data using Shapefile Loader.
The following GIF instructions show how to load shapefiles to a PostgreSQL database.
Before uploading the data, you need to find out what is the Coordinate reference system (CRS) of the data. In this case, I already looked at the data with the help of QGIS, and I knew the projection code. But if you are not unsure, you can input 4326 which is the commonly used WGS 84 code.
After we load the data, we are set up to perform spatial data analysis with PostGIS. I also upload some other files into the DatabaseDatabase: streets, boundaries and London underground stations. We will be using them as
Basic Spatial Operations
Spatial SQL can make processing and analyzing geospatial data easy and can speed up also specific processes. There are hundreds of spatial functions readily available at your disposal.
Let us start with some simple examples. In a later post, I will delve deeper into Spatial SQL functions and processes. I have uploaded s
Example 1: Calculate the Area of buildings
SELECT objectid, shapestare, ROUND(ST_Area(geom)::numeric,2) AS AREA FROM public.buildings
This simple SQL code returns all the buildings with new area column with two decimal precision. The Total query runtime: 659 msec. With 176023 rows affected.
We are using here the
ST_Area() PostGIS function.
Example 2: Calculate the Length of each Street type
SELECT streettype, Sum(ST_Length(geom)) AS length FROM public.streets GROUP BY 1 ORDER BY 2 DESC;
The above SQL code groups each street type and calculates the length of each type. We are using here the
ST_Length() PostGIS function.
Access PostGIS with Python
The good thing is that you can connect your PostGIS database easily in Python. Geopandas, which is the most used geospatial python library, can ingest and read SQL statements from PostGIS. Here is how you can connect your DatabaseDatabase and run Spatial SQL statements in Python.
import geopandas as gpd import sqlalchemy as db
# Create a database connection engine = db.create_engine(‘postgresql+psycopg2://postgres:postgres@localhost:5432/postgis_database’) con = engine.connect()
# Read PostGIS database with Geopandas. sql = “SELECT * FROM public.streets;” data = gpd.read_postgis(sql=sql, con=con)
Bringing your PostGIS database into Python opens up other possibilities to explore the data and visualize it. Let us visualize the streets dataset we have read using Geopandas and Contextily base maps.
import geopandas as gpd import matplotlib.pyplot as plt import contextily as ctx
fig, ax = plt.subplots(figsize=(18, 16)) data.to_crs(epsg=3857).plot(ax = ax, figsize=(20,18), color=”gray”, alpha=0.5, ); ctx.add_basemap(ax, url=ctx.providers.OpenStreetMap.Mapnik);
In this guide, we have installed PostgreSQL and activated the PostGIS extension. We have seen how to create a database using PgAdmin and also upload spatial data into the DatabaseDatabase. We have also seen some simple examples of Spatial SQL, and finally, we have integrated the spatial DatabaseDatabase into Python.
I am planning to write spatial data processing with Spatial SQL articles next. So stay tuned.