Plotting Data onto Geological Maps

This post shows how you can plot latitude and longitude (lat/long) data onto geological maps. At the end of this post, you'll be able to generate awesome-looking, interactive maps like this:

However, before we begin, be sure to install the following Python packages:

  • pyshp — for reading .shp formats.
  • folium — for plotting data on an interactive Leaflet map.

Installation:

  • pip install pyshp
  • pip install folium or conda install folium

0. Import Dependencies (As Always)

import shapefile  # "pip install pyship" 
import folium     # "pip install folium"
import glob
import pandas as pd
from collections import OrderedDict

1. Retrieve Dataset of MRT Exit Locations

Singapore's LTA Datamall provides geospatial datasets that are available in the Esri shapefile format. These datasets include a variety of geographical information, such as bus stops, MRT stations, MRT exits, and even lamp posts!

Look for the "Download All" link, download the ZIP file, and extract its contents.

We are only interested in the .shp and .dbf files in each dataset.

shp_files = glob.glob('./**/*.shp', recursive=True)
dbf_files = glob.glob('./**/*.dbf', recursive=True)

# Take a look at the top 8 .shp files
print(shp_files[:8])
['./GEOSPATIAL/ArrowMarking_May2017/ArrowMarking.shp',
 './GEOSPATIAL/Bollard_May2017/Bollard.shp',
 './GEOSPATIAL/BusStopLocation_May2017/BusStop.shp',
 './GEOSPATIAL/ControlBox_May2017/ControllerBox.shp',
 './GEOSPATIAL/ConvexMirror_May2017/ConvexMirror.shp',
 './GEOSPATIAL/CoveredLinkway_May2017/CoveredLinkWay.shp',
 './GEOSPATIAL/CyclingPath_May2017/CyclingPath.shp',
 './GEOSPATIAL/DetectorLoop_May2017/DetectorLoop.shp']

We are only interested in the dataset containing the locations of MRT Exits (there may be more than one exit per MRT Station).

query = 'TrainStationExit'
index = None
for i, (shp, dbf) in enumerate(zip(shp_files, dbf_files)):
    if query.lower() in shp.lower():
        index = i
        break
print('The query \"{0}\" is at index {1}'.format(query, index))

Output:

The query "TrainStationExit" is at index 28

Once we know the index of the dataset that we're looking for (i.e. train station exits), we'll read its .shp and .dbf files.

myshp = open(shp_files[index], 'rb')
mydbf = open(dbf_files[index], 'rb')
sf = shapefile.Reader(shp=myshp, dbf=mydbf)
records = sf.shapeRecords()

2. Inspect the Records

# Count the number of shape objects
print('There are {} shape objects'.format(len(records)))
There are 474 shape objects

Take a look at the top 5 items in the records list.

for x in records[:5]:
    msg = '{0:<23} ({1:<1})  @  {2}'
    print(msg.format(x.record[1], x.record[2], x.shape.points[0]))
OUTRAM PARK MRT STATION (C)  @  [28674.007886500644, 29288.625595194593]
OUTRAM PARK MRT STATION (D)  @  [28678.80760775766, 29311.492819941253]
OUTRAM PARK MRT STATION (E)  @  [28720.131027685264, 29271.248972275072]
DOVER MRT STATION       (A)  @  [21877.47712613484, 32635.24066091724]
DOVER MRT STATION       (B)  @  [21895.290533801097, 32665.953436625015]

Convert records into a Pandas DataFrame and take a look at the top 5 rows.

# Retrieve the "STN_NAME", "EXIT_CODE", and the coordinates
list_of_tuples = [(x.record[1], x.record[2], x.shape.points[0]) for x in records]

# Unzip list of tuples into individual lists
stn_names, exit_codes, coordinates = zip(*list_of_tuples)

d = OrderedDict([('stn_names', stn_names), 
                 ('exit_codes', exit_codes), 
                 ('coordinates', coordinates)])

df = pd.DataFrame(d)
df.head()
stn_names exit_codes coordinates
0 OUTRAM PARK MRT STATION C [28674.007886500644, 29288.625595194593]
1 OUTRAM PARK MRT STATION D [28678.80760775766, 29311.492819941253]
2 OUTRAM PARK MRT STATION E [28720.131027685264, 29271.248972275072]
3 DOVER MRT STATION A [21877.47712613484, 32635.24066091724]
4 DOVER MRT STATION B [21895.290533801097, 32665.953436625015]

3. Fixing the Coordinates

If you're observant, you'll notice that the coordinates look weird — and that's because Singapore has a special coordinate system called SVY21.

Thankfully, cgcai had already wrote an open source tool to convert these coordinates to the standard lat/long that we are all familiar with.

For convenience, here's the SVY21 Python code.

Bonus: If you're using Jupyter Notebook, simply run the following code in a Juypyter Notebook cell and it'll automatically fetch the SVY21.py script.

%load "https://gist.githubusercontent.com/jovianlin/30b0fd93e2b835cb1e8d93bfa0e1e62f/raw/89548360fb54ab1e0441b0935fc52fbbac6d263e/SVY21.py"

Now, let's fix the coordinates. We'll add a new column new_coordinates to contain the corrected lat/long.

svy = SVY21()
df['new_coordinates'] = df.coordinates.apply(lambda x: svy.computeLatLon(x[1], x[0]))
df.head()
stn_names exit_codes coordinates new_coordinates
0 OUTRAM PARK MRT STATION C [28674.007886500644, 29288.625595194593] (1.2811497583405485, 103.83937446565562)
1 OUTRAM PARK MRT STATION D [28678.80760775766, 29311.492819941253] (1.2813565613992561, 103.8394175934818)
2 OUTRAM PARK MRT STATION E [28720.131027685264, 29271.248972275072] (1.2809926092393662, 103.83978889939382)
3 DOVER MRT STATION A [21877.47712613484, 32635.24066091724] (1.3114147664293516, 103.77830438221726)
4 DOVER MRT STATION B [21895.290533801097, 32665.953436625015] (1.3116925253448966, 103.7784644383023)

4. Map Exit Codes to a Range of Colors

To make the task more exciting, we'll map each unique exit code to a unique color. For that, we'll need to define a color range, which fortunately, has been addressed in a previous post.

from colour import Color
def get_color_range(n, output_type='hex'):
    red = Color('red')
    blue = Color('blue')
    color_range = list(red.range_to(blue, n))
    if output_type == 'hex':
        return [c.get_hex_l() for c in color_range]
    else:
        return [c.get_rgb() for c in color_range]
        
# List all unique exit codes
unique_exit_codes = list(df.exit_codes.unique())
print(unique_exit_codes)
print()
# Generate a range of colors
color_range = get_color_range(len(unique_exit_codes))
print(color_range)
['C', 'D', 'E', 'A', 'B', 'F', 'G', 'I', 'H', 'J', 'NULL', 'A1', 'A2']

['#ff0000', '#ff5500', '#ffaa00', '#ffff00', '#aaff00', '#55ff00', '#00ff00', '#00ff55', '#00ffaa', '#00ffff', '#00aaff', '#0055ff', '#0000ff']

Map exit_code to color_range with a dictionary that we call code2color.

code2color = {exit_code:c for exit_code, c in zip(unique_exit_codes, color_range)}
print(code2color)
{'A': '#ffff00',
 'A1': '#0055ff',
 'A2': '#0000ff',
 'B': '#aaff00',
 'C': '#ff0000',
 'D': '#ff5500',
 'E': '#ffaa00',
 'F': '#55ff00',
 'G': '#00ff00',
 'H': '#00ffaa',
 'I': '#00ff55',
 'J': '#00ffff',
 'NULL': '#00aaff'}

5. Finally, the Moment We've All Been Waiting For

sg_map = folium.Map(location=[1.38, 103.8], zoom_start=11)

for _, row in df.iterrows():
    
    stn_name = row.stn_names
    exit_code = row.exit_codes
    lat, long = row.new_coordinates
    
    folium.RegularPolygonMarker([lat, long],
        popup='{} / Exit: {}'.format(stn_name, exit_code),
        fill_color=code2color[exit_code],
        number_of_sides=20,
        radius=8
        ).add_to(sg_map)

sg_map

Click here to open the map in a new window/tab.


Other Resources


If you enjoyed this post and want to buy me a cup of coffee...

The thing is, I'll always accept a cup of coffee. So feel free to buy me one.

Cheers! ☕️