What is new in GeoPandas 0.7.0?
Major changes and new improvements with examples and code illustrations.
GeoPandas is the workhorse of geospatial data science in Python and extends the data types of pandas to support spatial operations. GeoPandas 0.7.0 was released yesterday, 17 February, and it brings some significant changes and improvements.
Here I highlight some of the best new features available in GeoPandas 0.7.0.
Native Clip functionality
Clipping has become easy with a native function that clips a GeoDataFrame to the spatial extent of other shapes. Clipping is one of the most commonly used geospatial data processing operations; however, previous releases of GeoPandas did not have a straightforward function to perform it. If you want a specific area of your geographic data, you have to clip the data to your area of interest.
Now, with the new geopandas.clip function, you can easily clip your data to the spatial extent you provide. Let us see an example.
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import Polygon, LineString
gpd.__version__
We use two datasets that ship with GeoPandas: the capital cities and the world boundaries. We also subset Africa and create a polygon to mark the spatial extent we are interested in.
# Load the bundled datasets
capital_cities = gpd.read_file(gpd.datasets.get_path("naturalearth_cities"))
world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
# Keep only the African countries
africa = world[world["continent"] == "Africa"]
# Build the extent of interest as a one-row GeoDataFrame in the same CRS as the world data
poly = Polygon([(-35, 0), (-35, 60), (60, 60), (60, 0), (0, 0)])
polygon = gpd.GeoDataFrame(geometry=[poly], crs=world.crs)
Let us see all the data in one image.
fig, ax = plt.subplots(figsize=(12,10))
africa.plot(ax=ax, edgecolor="black", color="brown", alpha=0.5)
capital_cities.plot(ax=ax, color="red")
polygon.boundary.plot(ax=ax, color="red")
The capital cities are marked as red dots, Africa is the shaded area, and the red rectangle shows the extent we are interested in.
Now, if you want to clip the world boundaries to the red rectangle, you can call geopandas.clip and provide the two GeoDataFrames: first the world boundaries (world), then the clipping extent (polygon).
clipped = gpd.clip(world, polygon)
fig, ax = plt.subplots(figsize=(12,10))
clipped.plot(ax=ax)
You now have the clipped areas: parts of Africa, Europe and Asia.
This function works not only for clipping polygons with polygons but also for other geometry types such as lines and points.
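As a minimal sketch of clipping other geometry types, the snippet below clips a few points and a line with a square mask. The mask, the coordinates, and the variable names are made up for illustration and are not part of the datasets used above.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point, Polygon

# Hypothetical clipping mask: a 10 x 10 square
mask = gpd.GeoDataFrame(
    geometry=[Polygon([(0, 0), (0, 10), (10, 10), (10, 0)])]
)

# Two points, one inside the square and one outside
points = gpd.GeoDataFrame(
    {"name": ["inside", "outside"]},
    geometry=[Point(5, 5), Point(20, 20)],
)

# One line that starts left of the square and ends right of it
line = gpd.GeoDataFrame(geometry=[LineString([(-5, 5), (15, 5)])])

# Points outside the mask are dropped
clipped_points = gpd.clip(points, mask)

# The line is cut at the edges of the mask
clipped_line = gpd.clip(line, mask)

print(clipped_points)
print(clipped_line.geometry.iloc[0])
```

Only the point inside the square survives, and the line is trimmed to the part that crosses the square. The same call pattern applies to the world and polygon GeoDataFrames from the example above.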
Filter Rows while reading files
With today's large datasets, the ability to filter rows while reading the data is essential. Imagine having a dataset with millions of rows and not being able to read it because of memory limits. With this new functionality, you can pass a number of rows or a slice so that only that part of the file is read.
Let us see an example.
cities_filtered = gpd.read_file(gpd.datasets.get_path("naturalearth_cities"), rows=30)
This reads only the first 30 rows of the data. If you want rows from the middle of the file, you can pass a slice instead.
cities_filtered_slice = gpd.read_file(gpd.datasets.get_path("naturalearth_cities"), rows=slice(10, 30))
Only rows 10 through 29 are returned by the code above, because a slice excludes its end position.
Plot Geometry Collection
Previously, it was not possible to plot GeometryCollection data, that is, a single geometry made up of different geometry types (Points, Lines or Polygons). With the current release, you can plot such collections of geometric objects. Let us see an example.
a = LineString([(0, 0), (1, 1), (1,2), (2,2)])
b = LineString([(0, 0), (1, 1), (2,1), (2,2)])
x = a.intersection(b)
gc = gpd.GeoSeries(x)
Both a and b are plain LineStrings, but their intersection returns a GeometryCollection (a line and a point). If you create a GeoSeries out of the collection of geometries in x, you can now plot it with GeoPandas by calling gc.plot(). The resulting plot shows both a Point and a Line from the same GeoSeries.
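Before plotting, it can help to confirm what the intersection actually contains. The sketch below rebuilds the two lines from the example and lists the parts of the resulting collection; the variable names parts and part_types are introduced here only for illustration.

```python
import geopandas as gpd
from shapely.geometry import LineString

a = LineString([(0, 0), (1, 1), (1, 2), (2, 2)])
b = LineString([(0, 0), (1, 1), (2, 1), (2, 2)])

# The lines share the segment (0, 0)-(1, 1) and meet again at the point (2, 2)
x = a.intersection(b)

# A GeometryCollection exposes its members through .geoms
parts = list(x.geoms)
part_types = sorted(part.geom_type for part in parts)

print(x.geom_type)
print(part_types)

# Wrapping the collection in a GeoSeries keeps it as a single geometry
gc = gpd.GeoSeries([x])
print(len(gc))
```

Calling gc.plot() on this GeoSeries then draws the shared segment and the single meeting point together.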
PROJ 6 replaces PROJ 4
The GeoPandas 0.7.0 release starts using a new projection interface. PROJ 6 replaces PROJ 4 and brings a better interface and additional information.
In previous versions, the crs attribute of a GeoDataFrame returned only a short string or dictionary.
However, the current release exposes much richer metadata about the coordinate reference system of the data.
For example, world.crs now returns the following useful information:
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
You can also access each of these parts individually. For example, to get the datum of your projection, you can call world.crs.datum.
And the result shows this part only:
DATUM["World Geodetic System 1984",
    ELLIPSOID["WGS 84",6378137,298.257223563,
        LENGTHUNIT["metre",1]],
    ID["EPSG",6326]]
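Other parts of the CRS can be read the same way. The following is a small sketch, assuming a GeoDataFrame created with crs="EPSG:4326"; the sample point and the variable names are placeholders. The exact datum name printed can differ between PROJ versions, so no output is shown for it.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical one-row GeoDataFrame in WGS 84
gdf = gpd.GeoDataFrame(geometry=[Point(0, 0)], crs="EPSG:4326")

crs = gdf.crs  # a pyproj CRS object in GeoPandas 0.7.0 and later

crs_name = crs.name
epsg_code = crs.to_epsg()
ellipsoid_name = crs.ellipsoid.name
prime_meridian_name = crs.prime_meridian.name

print(crs_name)
print(epsg_code)
print(crs.is_geographic)
print(crs.datum.name)
print(ellipsoid_name)
print(prime_meridian_name)
```

Because the CRS is now an object rather than a string, checks such as crs.is_geographic can be used in code, for example before computing distances or areas.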
There are also other improvements and bug fixes not covered in this article that are worth mentioning. Spatial joins in GeoPandas now handle a MultiIndex correctly and preserve the index name of the left GeoDataFrame. When writing a file to disk, you can now keep the index if you want. Plotting choropleth maps with missing data is also possible with this release.
Thanks to all contributors to this release. GeoPandas 0.7.0 brings a lot of improvements to geospatial data science workflows. Clipping, filtering rows while reading, and plotting geometry collections are now possible thanks to this release. PROJ 6 also brings a better user interface to one of the least understood topics in the geospatial world: geographic projections.
The code for this article is available in this Google Colab link.