Geospatial indexing app with different backends using Spring Boot and Testcontainers
— Geospatial, GeoJSON, Testcontainers, Redis, PostGIS, Spring Boot, Java — 5 min read
Summary: A REST API with different backend implementations to index and search geospatial data.
Story
As a developer, I want to index location coordinates so that I can later search for nearby locations within a certain distance from a specified point.
Acceptance Criteria
- REST API using GeoJSON as the data model.
- Easily switchable configuration on what backend to use.
If you want to go straight to the code, see the GitHub project here.
TDD Approach
We will perform API-level tests against a running instance of the application with the help of Testcontainers. The same tests run against both implementations. They are ordered so that all geometry types are indexed first, and the location proximity tests then run against the indexed locations.
@ActiveProfiles("redis")
@ContextConfiguration(
    classes = { GeoIndexApplication.class },
    initializers = { ContainerUtils.RedisContainerInitializer.class })
public class RedisIntegrationTest extends ApiIntegrationTest {
}

@ActiveProfiles("postgis")
@ContextConfiguration(
    classes = { GeoIndexApplication.class },
    initializers = { ContainerUtils.PostgisContainerInitializer.class })
public class PostgisIntegrationTest extends ApiIntegrationTest {
}
Architecture
The GeoJSON format will be used as the standard data model. Aside from the geospatial data, an identifier and a key are required during indexing. They are used to identify and group the locations. For example, we want to index Mt. Everest and Mt. Fuji under the mountains group.
The data model for this application focuses only on geospatial data. Properties that are not useful for geospatial functions are left to the API client. For example, the name of a mountain has no impact on calculating its proximity to other mountains, so we will not be saving it.
For a simpler implementation, we will only support Point, LineString, and Polygon geometric types.
For distance proximity queries, distances are expressed in meters for now.
API
We will have two endpoints to support indexing and searching locations.
POST v1/geo-indexes/{key}
- accepts geometry types in GeoJSON format.
GET v1/geo-indexes/{key}/radius
- returns the list of ids of the nearby locations within the specified radius of a given latitude and longitude.
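As a rough illustration of how a client might call the two endpoints, the sketch below builds the requests with the JDK's java.net.http API. Several details are assumptions and not taken from the project: the base URL, carrying the identifier in the GeoJSON Feature id member, and the query parameter names latitude, longitude, and radius. Check Swagger UI for the actual contract.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class GeoIndexRequests {
    // Hypothetical base URL; adjust to wherever the application is running.
    static final String BASE_URL = "http://localhost:8080/";

    // Builds a GeoJSON Feature holding a Point. GeoJSON orders coordinates
    // as [longitude, latitude]. Using the Feature "id" member for the
    // identifier is an assumption made for this sketch.
    static String pointFeature(String id, double longitude, double latitude) {
        return "{\"type\":\"Feature\",\"id\":\"" + id + "\","
            + "\"geometry\":{\"type\":\"Point\","
            + "\"coordinates\":[" + longitude + "," + latitude + "]}}";
    }

    // POST v1/geo-indexes/{key} with a GeoJSON body.
    static HttpRequest indexRequest(String key, String geoJson) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "v1/geo-indexes/" + key))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(geoJson))
            .build();
    }

    // GET v1/geo-indexes/{key}/radius; the query parameter names are hypothetical.
    static HttpRequest radiusRequest(String key, double latitude, double longitude,
                                     double radiusMeters) {
        String query = "?latitude=" + latitude
            + "&longitude=" + longitude
            + "&radius=" + radiusMeters;
        return HttpRequest.newBuilder(
                URI.create(BASE_URL + "v1/geo-indexes/" + key + "/radius" + query))
            .GET()
            .build();
    }
}
```

The requests can then be sent with java.net.http.HttpClient, for example by indexing Mt. Everest (longitude 86.925, latitude 27.9881) under the mountains key and searching around the same point.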
You can easily test them via Swagger UI.
Geospatial support
In this POC, we will try two different tools with geospatial support.
Using Redis
Redis has several commands related to geospatial indexing (GEO commands) but unlike other commands these commands lack their own data type. These commands actually piggy back on the sorted set datatype. This is achieved by encoding the latitude and longitude into the score of the sorted set using the geohash algorithm. https://redis.com/redis-best-practices/indexing-patterns/geospatial/
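To make the score encoding more concrete, here is a simplified sketch of the idea: each coordinate is mapped to a 26-bit cell number within its range, and the two numbers are bit-interleaved into one 52-bit integer, which fits exactly in the double-precision score of a sorted set. The latitude bounds and the bit layout are assumed to mirror Redis, but this is an illustration of the technique, not Redis's actual code.

```java
final class GeoHashScore {
    // Assumed limits: Redis restricts latitude to roughly +/-85.05 degrees.
    static final double LAT_MIN = -85.05112878, LAT_MAX = 85.05112878;
    static final double LON_MIN = -180.0, LON_MAX = 180.0;
    // 26 bits per axis gives 52 bits in total.
    static final int BITS = 26;

    // Encodes a position into a 52-bit interleaved integer.
    // Argument order follows GEOADD: longitude first, then latitude.
    static long encode(double longitude, double latitude) {
        long latCell = cell(latitude, LAT_MIN, LAT_MAX);
        long lonCell = cell(longitude, LON_MIN, LON_MAX);
        long hash = 0L;
        for (int i = 0; i < BITS; i++) {
            hash |= ((latCell >> i) & 1L) << (2 * i);      // latitude bits on even positions
            hash |= ((lonCell >> i) & 1L) << (2 * i + 1);  // longitude bits on odd positions
        }
        return hash;
    }

    // Maps a value in [min, max] to a cell number in [0, 2^26 - 1].
    static long cell(double value, double min, double max) {
        if (value < min || value > max) {
            throw new IllegalArgumentException("coordinate out of range: " + value);
        }
        double fraction = (value - min) / (max - min);
        long cells = 1L << BITS;
        // Clamp so the upper bound itself still fits in 26 bits.
        return Math.min((long) (fraction * cells), cells - 1);
    }
}
```

Because the high-order bits of the two axes are interleaved, points that share a long prefix of the resulting integer fall in the same cell, which is what lets a range scan over sorted-set scores narrow down the candidates for a radius search.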
We will be using GEOADD and GEORADIUS.
Since Redis only allows indexing of latitude and longitude, we are going to extract all the points from the GeoJSON object and index them individually.
The key on which you index the location should also be used when you are performing the location search.
If you index 10 mountain locations under the mountains key, they won't show if you search for them on the rivers key.
For a Point, only 1 coordinate is indexed. For a LineString and Polygon, all coordinates are indexed. We will follow the format below when assigning the member id (string).
<identifier>:<index>
So for example a river with identifier NILE containing 5 points will be indexed as:
NILE:0
NILE:1
NILE:2
NILE:3
NILE:4
However, when performing a location search, if any one of the coordinates is part of the result, the suffix is discarded and only the identifier is returned.
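The member id convention can be sketched in a few lines of plain Java. The class and method names here are hypothetical and not taken from the project; the sketch only shows how ids in the <identifier>:<index> format could be generated at indexing time and collapsed back to unique identifiers when reading search results.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class MemberIds {
    // One member id per coordinate: NILE:0, NILE:1, ...
    static List<String> toMemberIds(String identifier, int coordinateCount) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < coordinateCount; i++) {
            ids.add(identifier + ":" + i);
        }
        return ids;
    }

    // Drops the ":<index>" suffix. The last colon is used as the separator
    // so identifiers that contain colons themselves still round-trip.
    static String toIdentifier(String memberId) {
        int separator = memberId.lastIndexOf(':');
        return separator < 0 ? memberId : memberId.substring(0, separator);
    }

    // Collapses search hits to unique identifiers, keeping first-seen order.
    static Set<String> toIdentifiers(List<String> memberIds) {
        Set<String> identifiers = new LinkedHashSet<>();
        for (String memberId : memberIds) {
            identifiers.add(toIdentifier(memberId));
        }
        return identifiers;
    }
}
```

With this approach, a search that matches both NILE:3 and NILE:4 reports NILE only once.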
Libraries Used:
- Spring Data Redis
- gt-geojson-core to parse GeoJSON text
Using PostGIS
PostGIS is a spatial database extender for PostgreSQL object-relational database. It adds support for geographic objects allowing location queries to be run in SQL.
We will be storing the geospatial data in a geometries table with 3 columns for key, identifier, and geometry.
The GeoJSON data will be stored in the geometry column.
The ST_DWithin function will be used to perform the proximity query.
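For illustration, a proximity query against such a table might look like the sketch below. It uses plain JDBC rather than the Spring Data JPA and Hibernate Spatial setup of the project, and it assumes the geometry column stores WGS 84 longitude/latitude so that it can be cast to geography. On geography arguments, ST_DWithin takes its distance in meters; on plain geometry it uses the units of the spatial reference system. The class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class PostgisRadiusQuery {
    // Table and column names follow the description above.
    // ST_MakePoint takes (x, y), which means (longitude, latitude).
    static final String SQL =
        "SELECT identifier FROM geometries "
        + "WHERE key = ? "
        + "AND ST_DWithin(CAST(geometry AS geography), "
        + "CAST(ST_SetSRID(ST_MakePoint(?, ?), 4326) AS geography), ?)";

    // Returns the identifiers of geometries within radiusMeters of the point.
    static List<String> findNearby(Connection connection, String key,
                                   double longitude, double latitude,
                                   double radiusMeters) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(SQL)) {
            statement.setString(1, key);
            statement.setDouble(2, longitude);
            statement.setDouble(3, latitude);
            statement.setDouble(4, radiusMeters);
            List<String> identifiers = new ArrayList<>();
            try (ResultSet rows = statement.executeQuery()) {
                while (rows.next()) {
                    identifiers.add(rows.getString("identifier"));
                }
            }
            return identifiers;
        }
    }
}
```

Unlike the Redis approach, the whole LineString or Polygon is compared against the search point here, so no per-coordinate member ids are needed.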
Libraries Used:
- Spring Data JPA
- Hibernate Spatial
- gt-geojson-core to parse GeoJSON text
Other tools with geospatial support
- Elasticsearch - https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-queries.html
- MySQL - https://dev.mysql.com/doc/refman/8.0/en/spatial-types.html
- S2 Library - https://s2geometry.io/
Local Development
With the help of Testcontainers, we can run a standalone instance of the application together with the backend Docker container it requires. This speeds up development and is also handy if you just want to try out the application.
We got this approach from this awesome blog by Sergei Egorov. Check it out!
Stateful
Specify the desired backend implementation using a profile when running the GeoIndexApplication, then configure the connection in the application-*.yaml files based on your local environment after you start the required container (Redis or PostGIS).
Stateless
To run the standalone version of the application, just run the provided <Impl>GeoIndexApplication class. This requires no additional configuration; just ensure you have Docker installed.
However, the data will be lost when the application is shut down. Verifying the data will also be challenging since the ports of the containers are randomly assigned.
Thank you for your time reading up until this point. I would love to hear your feedback and things to improve not only on the literature but on the code as well!
Originally posted on Medium Geospatial indexing app with different backends using Spring Boot and Testcontainers.
!!! UPDATE (16-03-2023) !!!
Using Elasticsearch
The geo_shape mapping maps GeoJSON or WKT geometry objects to the geo_shape type. The geo_distance query finds documents with geoshapes or geopoints within the specified distance of a central point.
We will be storing a GeometryDocument with an id field and location. The location will be indexed in WKT format.
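As a hedged illustration of what the WKT form of the three supported geometry types looks like, the sketch below formats coordinates by hand. The project itself may well rely on a library for this conversion, and the class and method names are hypothetical. Like GeoJSON, WKT puts longitude (x) before latitude (y).

```java
import java.math.BigDecimal;

final class Wkt {
    // POINT (x y)
    static String point(double longitude, double latitude) {
        return "POINT (" + pair(longitude, latitude) + ")";
    }

    // LINESTRING (x y, x y, ...); each inner array is {longitude, latitude}.
    static String lineString(double[][] coordinates) {
        return "LINESTRING (" + join(coordinates) + ")";
    }

    // POLYGON ((x y, ...)) with a single outer ring and no holes.
    // The ring is expected to be closed: first and last coordinates equal.
    static String polygon(double[][] outerRing) {
        return "POLYGON ((" + join(outerRing) + "))";
    }

    private static String join(double[][] coordinates) {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < coordinates.length; i++) {
            if (i > 0) {
                text.append(", ");
            }
            text.append(pair(coordinates[i][0], coordinates[i][1]));
        }
        return text.toString();
    }

    private static String pair(double x, double y) {
        return plain(x) + " " + plain(y);
    }

    // Avoids scientific notation such as 1.0E-5 for very small values.
    private static String plain(double value) {
        return BigDecimal.valueOf(value).toPlainString();
    }
}
```

For example, Mt. Everest as a point would be written as POINT (86.925 27.9881).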
Libraries Used:
- Spring Data Elasticsearch
- gt-geojson-core to parse GeoJSON text
See git commits here