Working with Geospatial Data¶
This tutorial covers UK Power Networks geospatial and infrastructure data:
- Introduction to geospatial/infrastructure data
- Using the geo orchestrator
- Listing available datasets
- Getting primary substations data
- Getting secondary sites data
- Getting overhead lines (HV, LV)
- Getting poles (HV, LV)
- Exporting to GeoJSON format
Prerequisites:
- Complete 01-getting-started.ipynb first
- Have your UKPN_API_KEY environment variable set
- These tutorials require additional dependencies. Install them with pip install "ukpyn[all]" (see Tutorial 01 for full setup instructions)
1. Introduction to Geospatial/Infrastructure Data¶
UK Power Networks provides detailed geospatial data about their distribution network infrastructure. This includes:
Primary Substations¶
Primary substations are major transformation points that step voltage down from the higher-voltage distribution network (typically 33kV, and 132kV at some sites) to 11kV. These substations serve large geographical areas and are critical nodes in the distribution network.
Secondary Sites¶
Secondary sites (also known as distribution substations) further step down voltage from 11kV to 400V for delivery to end customers. There are thousands of these sites across the UKPN network area.
Overhead Lines¶
The overhead line network carries electricity between substations and to customers:
- HV (High Voltage): Typically 11kV lines connecting primary to secondary substations
- LV (Low Voltage): 400V/230V lines delivering power to end customers
Poles¶
Poles support the overhead line network:
- HV Poles: Support 11kV overhead lines
- LV Poles: Support low voltage overhead lines
UKPN Licence Areas¶
UK Power Networks operates across three licence areas:
- EPN (Eastern Power Networks) - East of England
- LPN (London Power Networks) - Greater London
- SPN (South Eastern Power Networks) - South East England
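The licence_area filter used later in this tutorial takes the full name followed by the code, as in "London Power Networks (LPN)". The sketch below builds that label from a short code. The helper name licence_area_label is hypothetical and not part of ukpyn, and the EPN and SPN labels are assumed to follow the same pattern as the LPN one, so check them against the licence_area values in your own results.

```python
# Hypothetical helper, not part of ukpyn.
# Names come from the licence area list above; the "Name (CODE)" pattern
# is confirmed in this tutorial only for LPN and assumed for EPN and SPN.
LICENCE_AREAS = {
    "EPN": "Eastern Power Networks",
    "LPN": "London Power Networks",
    "SPN": "South Eastern Power Networks",
}


def licence_area_label(code: str) -> str:
    """Return 'Full Name (CODE)' for a licence area code such as 'lpn'."""
    key = code.strip().upper()
    if key not in LICENCE_AREAS:
        raise ValueError(f"Unknown licence area code: {code!r}")
    return f"{LICENCE_AREAS[key]} ({key})"


print(licence_area_label("lpn"))  # London Power Networks (LPN)
```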
import ukpyn
ukpyn.check_api_key()
print("API key configured!")
2. Using the Geo Orchestrator¶
The geo orchestrator provides a simple interface to access all geospatial datasets. It handles API authentication and provides convenient functions for common tasks.
Import it directly from the ukpyn package:
from ukpyn import gis
print("Geo orchestrator loaded successfully!")
# A convenient way to check that the orchestrator is working is to print its repr, which lists the methods on the object.
print(repr(gis))
3. Listing Available Datasets¶
The geo orchestrator provides access to several infrastructure datasets. You can list them all with gis.available_datasets:
# List all available geo datasets
print("Available geospatial datasets:")
print("-" * 40)
for dataset_name in gis.available_datasets:
    print(f" - {dataset_name}")
# Expected output:
# Available geospatial datasets:
# ----------------------------------------
# - primary_substations
# - primary_areas
# - secondary_sites
# - hv_overhead_lines
# - lv_overhead_lines
# - hv_poles
# - lv_poles
Dataset Descriptions¶
| Dataset Name | Description |
|---|---|
| primary_substations | Primary substation areas with postcode coverage |
| primary_areas | Alias for primary_substations |
| secondary_sites | Secondary (distribution) substation locations |
| hv_overhead_lines | High voltage overhead line routes (11kV) |
| lv_overhead_lines | Low voltage overhead line routes (400V) |
| hv_poles | High voltage pole locations |
| lv_poles | Low voltage pole locations |
4. Getting Primary Substations Data¶
Primary substations are the main transformation points feeding the 11kV network. Use gis.get_primary_substations() to retrieve this data.
You can optionally filter by licence area using the licence_area parameter.
# Get primary substations (all licence areas)
primary_substations = gis.get_primary_substations(limit=10)
print(f"Total primary substations: {primary_substations.total_count}")
print(f"Records returned: {len(primary_substations.records)}")
print("\nFirst 5 primary substations:")
print("-" * 50)
for record in primary_substations.records[:5]:
    if record.fields:
        name = record.fields.get("sitename", "Unknown")
        area = record.fields.get("towncity", "Unknown")
        print(f" {name} ({area})")
# Expected output:
# Total primary substations: 150
# Records returned: 10
#
# First 5 primary substations:
# --------------------------------------------------
# MANGANESE BRONZE PRIMARY 33kV (IPSWICH WEST)
# DOWNHAM MARKET PRIMARY 33kV (DOWNHAM MARKET)
# ...
# Filter by licence area (e.g., London Power Networks)
london_substations = gis.get_primary_substations(
    licence_area="London Power Networks (LPN)", limit=10
)
print(f"London primary substations: {london_substations.total_count}")
print("\nLondon substations:")
print("-" * 50)
for record in london_substations.records[:5]:
    if record.fields:
        name = record.fields.get("sitename", "Unknown")
        site_id = record.fields.get("sitefunctionallocation", "Unknown")
        print(f" {name}")
        print(f" Site ID: {site_id}")
# Expected output:
# London primary substations: 45
#
# London substations:
# --------------------------------------------------
# BARKING PRIMARY 33kV
# Site ID: LPN-S000000012345
# BRIMSDOWN PRIMARY 33kV
# Site ID: LPN-S000000067890
# ...
# Examine the fields available in primary substation data
if primary_substations.records:
    first_record = primary_substations.records[0]
    if first_record.fields:
        print("Available fields in primary substations data:")
        print("-" * 50)
        for field_name, value in first_record.fields.items():
            print(f" {field_name}: {type(value).__name__}")
# Expected output:
# Available fields in primary substations data:
# --------------------------------------------------
# sitename: str
# licence_area: str
# spatial_coordinates: dict
# ...
5. Getting Secondary Sites Data¶
Secondary sites are distribution substations that step down voltage for delivery to customers. Use gis.get_secondary_sites() to retrieve this data.
You can filter by parent primary substation using the primary_substation parameter.
# Get secondary sites (all)
secondary_sites = gis.get_secondary_sites(limit=10)
print(f"Total secondary sites: {secondary_sites.total_count}")
print(f"Records returned: {len(secondary_sites.records)}")
print("\nFirst 5 secondary sites:")
print("-" * 60)
for record in secondary_sites.records[:5]:
    if record.fields:
        site_id = record.fields.get("functionallocation", "Unknown")
        primary = record.fields.get("primaryfeederfunctionallocation", "Unknown")
        print(f" Site: {site_id} (Primary: {primary})")
# Expected output:
# Total secondary sites: 85000
# Records returned: 10
#
# First 5 secondary sites:
# ------------------------------------------------------------
# Site: ABC123 (Primary: BARKING)
# Site: DEF456 (Primary: BARKING)
# ...
# Get secondary sites for a specific primary substation
# First, look up the primary substation by its site ID
primary_substation = gis.get_primary_substations(site="SPN-S000000008608")
# Defaults so the check below is safe when the lookup returns no records
primary_name = None
primary_id = None
if primary_substation.records and primary_substation.records[0].fields:
    fields = primary_substation.records[0].fields
    primary_name = fields.get("sitename", "Unknown")
    primary_id = fields.get("sitefunctionallocation")
    print(f"Using primary substation: {primary_name}")
# Only query when the lookup returned a site ID to filter on
if primary_id:
    sites_for_primary = gis.get_secondary_sites(
        primary_substation=primary_id, limit=10
    )
    print(f"Secondary sites for {primary_name}: {sites_for_primary.total_count}")
    print("\nSites:")
    print("-" * 50)
    for record in sites_for_primary.records[:5]:
        if record.fields:
            site_id = record.fields.get("functionallocation", "Unknown")
            print(f" {site_id}")
# Expected output:
# Secondary sites for BARKING: 250
#
# Sites:
# --------------------------------------------------
# ABC123
# DEF456
# ...
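Once you have a page of records, the standard library is enough to summarise them. The sketch below counts secondary sites per primary feeder. The SimpleNamespace objects are stand-ins for the records returned by the orchestrator (only the .fields attribute used above is assumed), the function name is hypothetical, and the field values are invented for illustration.

```python
from collections import Counter
from types import SimpleNamespace

# Stand-in records with invented values; real records come from
# gis.get_secondary_sites(...).records
sample_records = [
    SimpleNamespace(fields={"functionallocation": "ABC123",
                            "primaryfeederfunctionallocation": "P1"}),
    SimpleNamespace(fields={"functionallocation": "DEF456",
                            "primaryfeederfunctionallocation": "P1"}),
    SimpleNamespace(fields={"functionallocation": "GHI789",
                            "primaryfeederfunctionallocation": "P2"}),
    SimpleNamespace(fields=None),  # records without fields are skipped
]


def count_sites_by_primary(records):
    """Count records per 'primaryfeederfunctionallocation' value."""
    counts = Counter()
    for record in records:
        if not record.fields:
            continue
        primary = record.fields.get("primaryfeederfunctionallocation", "Unknown")
        counts[primary] += 1
    return counts


site_counts = count_sites_by_primary(sample_records)
for primary, n in site_counts.most_common():
    print(f"{primary}: {n}")
```

Because the counts only cover the records you pass in, they describe the current page (here limit=10), not the whole dataset.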
6. Getting Overhead Lines (HV, LV)¶
Overhead lines carry electricity across the distribution network. Use gis.get_overhead_lines() with the voltage parameter to get either high voltage (HV) or low voltage (LV) lines:
- voltage="hv" - High voltage overhead lines (11kV)
- voltage="lv" - Low voltage overhead lines (400V)
# Get high voltage overhead lines
hv_lines = gis.get_overhead_lines(voltage="hv", limit=10)
print(f"Total HV overhead lines: {hv_lines.total_count}")
print(f"Records returned: {len(hv_lines.records)}")
print("\nFirst 5 HV overhead lines:")
print("-" * 60)
for record in hv_lines.records[:5]:
    if record.fields:
        line_id = record.fields.get("line_id", record.id)
        length = record.fields.get("length_km", "N/A")
        print(f" Line: {line_id} (Length: {length} km)")
# Expected output:
# Total HV overhead lines: 15000
# Records returned: 10
#
# First 5 HV overhead lines:
# ------------------------------------------------------------
# Line: HV001 (Length: 2.5 km)
# Line: HV002 (Length: 1.8 km)
# ...
# Get low voltage overhead lines
lv_lines = gis.get_overhead_lines(voltage="lv", limit=10)
print(f"Total LV overhead lines: {lv_lines.total_count}")
print(f"Records returned: {len(lv_lines.records)}")
print("\nFirst 5 LV overhead lines:")
print("-" * 60)
for record in lv_lines.records[:5]:
    if record.fields:
        line_id = record.fields.get("line_id", record.id)
        print(f" Line: {line_id}")
# Expected output:
# Total LV overhead lines: 45000
# Records returned: 10
#
# First 5 LV overhead lines:
# ------------------------------------------------------------
# Line: LV001
# Line: LV002
# ...
# Examine fields available in overhead line data
if hv_lines.records:
    first_record = hv_lines.records[0]
    if first_record.fields:
        print("Available fields in HV overhead lines data:")
        print("-" * 50)
        for field_name, value in first_record.fields.items():
            value_preview = (
                str(value)[:50] + "..." if len(str(value)) > 50 else str(value)
            )
            print(f" {field_name}: {value_preview}")
# Expected output:
# Available fields in HV overhead lines data:
# --------------------------------------------------
# geo_shape: {'type': 'LineString', 'coordinates': [...
# ...
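If a record carries no length field, an approximate length can be derived from the geo_shape geometry. The sketch below applies the haversine formula to a GeoJSON LineString. It assumes coordinates are [longitude, latitude] pairs in degrees (the GeoJSON convention); the function names and the sample geometry are illustrative and not part of ukpyn.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def linestring_length_km(geometry):
    """Sum the segment lengths of a GeoJSON LineString geometry."""
    if geometry.get("type") != "LineString":
        raise ValueError("expected a LineString geometry")
    coords = geometry["coordinates"]
    # Slice to [:2] so an optional third (elevation) value is ignored
    return sum(
        haversine_km(*start[:2], *end[:2])
        for start, end in zip(coords, coords[1:])
    )


# Illustrative geometry: one degree of latitude along the prime meridian
sample_line = {"type": "LineString", "coordinates": [[0.0, 0.0], [0.0, 1.0]]}
print(f"{linestring_length_km(sample_line):.1f} km")  # 111.2 km
```

Haversine treats the Earth as a sphere, which is adequate for a rough check on short line segments; for survey-grade lengths, project to a metric CRS first.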
7. Getting Poles (HV, LV)¶
Poles support the overhead line network. Use gis.get_poles() with the voltage parameter to get either high voltage (HV) or low voltage (LV) poles:
- voltage="hv" - High voltage poles
- voltage="lv" - Low voltage poles
# Get high voltage poles
hv_poles = gis.get_poles(voltage="hv", limit=10)
print(f"Total HV poles: {hv_poles.total_count}")
print(f"Records returned: {len(hv_poles.records)}")
print("\nFirst 5 HV poles:")
print("-" * 60)
for record in hv_poles.records[:5]:
    if record.fields:
        pole_id = record.fields.get("pole_id", record.id)
        print(f" Pole: {pole_id}")
# Expected output:
# Total HV poles: 50000
# Records returned: 10
#
# First 5 HV poles:
# ------------------------------------------------------------
# Pole: HVP001
# Pole: HVP002
# ...
# Get low voltage poles
lv_poles = gis.get_poles(voltage="lv", limit=10)
print(f"Total LV poles: {lv_poles.total_count}")
print(f"Records returned: {len(lv_poles.records)}")
print("\nFirst 5 LV poles:")
print("-" * 60)
for record in lv_poles.records[:5]:
    if record.fields:
        pole_num = record.fields.get("pole_num", record.id)
        print(f" Pole: {pole_num}")
# Expected output:
# Total LV poles: 120000
# Records returned: 10
#
# First 5 LV poles:
# ------------------------------------------------------------
# Pole: LVP001
# Pole: LVP002
# ...
# Examine fields available in pole data
if hv_poles.records:
    first_record = hv_poles.records[0]
    if first_record.fields:
        print("Available fields in HV poles data:")
        print("-" * 50)
        for field_name, value in first_record.fields.items():
            value_preview = (
                str(value)[:50] + "..." if len(str(value)) > 50 else str(value)
            )
            print(f" {field_name}: {value_preview}")
# Expected output:
# Available fields in HV poles data:
# --------------------------------------------------
# pole_id: HVP001
# geo_point: {'lat': 51.5074, 'lon': -0.1278}
# ...
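The geo_point value shown above stores latitude and longitude as named keys, while GeoJSON expects positions in [longitude, latitude] order. The sketch below converts one into a GeoJSON Point feature; the function name is hypothetical, and the sample values are the illustrative ones from the expected output above.

```python
import json


def geo_point_to_feature(geo_point, properties=None):
    """Build a GeoJSON Point feature from a {'lat': ..., 'lon': ...} dict."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON order is longitude first, then latitude
            "coordinates": [geo_point["lon"], geo_point["lat"]],
        },
        "properties": dict(properties or {}),
    }


feature = geo_point_to_feature(
    {"lat": 51.5074, "lon": -0.1278}, {"pole_id": "HVP001"}
)
print(json.dumps(feature["geometry"]))
# {"type": "Point", "coordinates": [-0.1278, 51.5074]}
```

Swapping the two values is a common cause of points appearing in the wrong place on a map, so it is worth checking the order whenever you build geometries by hand.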
8. Exporting to GeoJSON Format¶
GeoJSON is a standard format for encoding geographic data structures. It's widely supported by GIS applications like QGIS, ArcGIS, and web mapping libraries like Leaflet and Mapbox.
Use gis.export_geojson() to export any dataset in GeoJSON format.
import json
# Export primary substations as GeoJSON
primary_geojson_bytes = gis.export_geojson("primary_areas", limit=10)
# Parse the GeoJSON
primary_geojson = json.loads(primary_geojson_bytes.decode("utf-8"))
print(f"GeoJSON type: {primary_geojson.get('type')}")
print(f"Number of features: {len(primary_geojson.get('features', []))}")
# Show first feature
if primary_geojson.get("features"):
    first_feature = primary_geojson["features"][0]
    print(
        f"\nFirst feature geometry type: {first_feature.get('geometry', {}).get('type')}"
    )
    print(
        f"First feature properties: {list(first_feature.get('properties', {}).keys())}"
    )
# Expected output:
# GeoJSON type: FeatureCollection
# Number of features: 10
#
# First feature geometry type: MultiPolygon
# First feature properties: ['primary_substation', 'licence_area', ...]
# Export HV overhead lines as GeoJSON
hv_lines_geojson_bytes = gis.export_geojson("hv_overhead_lines", limit=10)
hv_lines_geojson = json.loads(hv_lines_geojson_bytes.decode("utf-8"))
print(f"GeoJSON type: {hv_lines_geojson.get('type')}")
print(f"Number of features: {len(hv_lines_geojson.get('features', []))}")
# Lines are typically LineString geometries
if hv_lines_geojson.get("features"):
    first_feature = hv_lines_geojson["features"][0]
    geometry_type = first_feature.get("geometry", {}).get("type")
    print(f"\nGeometry type: {geometry_type}")
# Expected output:
# GeoJSON type: FeatureCollection
# Number of features: 10
#
# Geometry type: LineString
# Export HV poles as GeoJSON
hv_poles_geojson_bytes = gis.export_geojson("hv_poles", limit=10)
hv_poles_geojson = json.loads(hv_poles_geojson_bytes.decode("utf-8"))
print(f"GeoJSON type: {hv_poles_geojson.get('type')}")
print(f"Number of features: {len(hv_poles_geojson.get('features', []))}")
# Poles are typically Point geometries
if hv_poles_geojson.get("features"):
    first_feature = hv_poles_geojson["features"][0]
    geometry_type = first_feature.get("geometry", {}).get("type")
    print(f"\nGeometry type: {geometry_type}")
# Expected output:
# GeoJSON type: FeatureCollection
# Number of features: 10
#
# Geometry type: Point
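The cells above inspect only the first feature of each export. To check every feature, a sketch along the following lines tallies geometry types across a parsed FeatureCollection. The function name and the sample collection are illustrative; in practice you would pass a dict such as hv_poles_geojson.

```python
from collections import Counter


def geometry_type_counts(feature_collection):
    """Count features by geometry type in a parsed GeoJSON FeatureCollection."""
    counts = Counter()
    for feature in feature_collection.get("features", []):
        # GeoJSON allows "geometry": null, so fall back to an empty dict
        geometry = feature.get("geometry") or {}
        counts[geometry.get("type", "missing")] += 1
    return dict(counts)


# Illustrative collection with invented coordinates
sample_collection = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": {"type": "Point", "coordinates": [0.1, 51.5]}, "properties": {}},
        {"type": "Feature", "geometry": {"type": "Point", "coordinates": [0.2, 51.6]}, "properties": {}},
        {"type": "Feature", "geometry": None, "properties": {}},
    ],
}
print(geometry_type_counts(sample_collection))  # {'Point': 2, 'missing': 1}
```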
# Save GeoJSON to file for use in GIS applications (optional)
from pathlib import Path
# Export more records for a useful file
full_geojson_bytes = gis.export_geojson("primary_areas", limit=100)
save_dir = None  # Set to a directory (e.g. "exports") to enable writing files.
if save_dir:
    output_file = Path(save_dir) / "primary_substations.geojson"
    output_file.parent.mkdir(parents=True, exist_ok=True)
    with open(output_file, "wb") as f:
        f.write(full_geojson_bytes)
    print(f"Saved {len(full_geojson_bytes)} bytes to {output_file}")
    print("\nYou can now open this file in:")
    print(" - QGIS")
    print(" - ArcGIS")
    print(" - geojson.io")
    print(" - Leaflet/Mapbox web maps")
else:
    print(
        f"Exported {len(full_geojson_bytes)} bytes (file save skipped; set save_dir to enable writing)."
    )
# Expected output (with save_dir = "exports"):
# Saved 125000 bytes to exports/primary_substations.geojson
#
# You can now open this file in:
# - QGIS
# - ArcGIS
# - geojson.io
# - Leaflet/Mapbox web maps
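Before writing exported bytes to disk, it can be worth confirming that they parse as a GeoJSON FeatureCollection. The sketch below is a minimal structural check, not a full GeoJSON validator. The function name is hypothetical, and the sample bytes stand in for the return value of gis.export_geojson().

```python
import json


def parse_feature_collection(geojson_bytes: bytes) -> dict:
    """Decode GeoJSON bytes and check the top-level structure."""
    data = json.loads(geojson_bytes.decode("utf-8"))
    if not isinstance(data, dict) or data.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    if not isinstance(data.get("features"), list):
        raise ValueError("FeatureCollection has no 'features' list")
    return data


# Stand-in for exported bytes, with invented content
sample_bytes = json.dumps(
    {
        "type": "FeatureCollection",
        "features": [
            {"type": "Feature", "geometry": None, "properties": {"name": "example"}}
        ],
    }
).encode("utf-8")

collection = parse_feature_collection(sample_bytes)
print(len(collection["features"]))  # 1
```

Malformed JSON raises json.JSONDecodeError, which is a subclass of ValueError, so a single except ValueError covers both failure modes.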
Working with GeoJSON in Python¶
If you have geopandas installed, you can load GeoJSON directly into a GeoDataFrame for spatial analysis.
# Optional: Load GeoJSON into geopandas
# Install with: pip install geopandas
try:
    from io import BytesIO
    import geopandas as gpd
    # Load directly from bytes
    geojson_bytes = gis.export_geojson("primary_areas", limit=50)
    gdf = gpd.read_file(BytesIO(geojson_bytes))
    print(f"GeoDataFrame shape: {gdf.shape}")
    print(f"CRS: {gdf.crs}")
    print(f"\nColumns: {list(gdf.columns)}")
    print(f"\nGeometry types: {gdf.geometry.geom_type.value_counts().to_dict()}")
    # Display first few rows
    display(gdf.head())
except ImportError:
    print("geopandas not installed.")
    print("Install with: pip install geopandas")
    print(
        "\nThis is optional - GeoJSON files can be used directly in GIS applications."
    )
# Expected output (with geopandas):
# GeoDataFrame shape: (50, 5)
# CRS: EPSG:4326
#
# Columns: ['primary_substation', 'licence_area', 'geometry', ...]
#
# Geometry types: {'MultiPolygon': 50}
Visualizing GeoJSON Data¶
If you have both geopandas and matplotlib installed, you can create quick visualizations.
# Optional: Simple visualization with geopandas
try:
    from io import BytesIO
    import geopandas as gpd
    import matplotlib.pyplot as plt
    # Load data
    geojson_bytes = gis.export_geojson("primary_areas", limit=50)
    gdf = gpd.read_file(BytesIO(geojson_bytes))
    # Create plot
    fig, ax = plt.subplots(figsize=(12, 10))
    gdf.plot(ax=ax, edgecolor="black", facecolor="lightblue", alpha=0.5)
    ax.set_title("UKPN Primary Substation Areas")
    ax.set_xlabel("Longitude")
    ax.set_ylabel("Latitude")
    plt.tight_layout()
    plt.show()
except ImportError as e:
    print(f"Visualization requires geopandas and matplotlib: {e}")
    print("\nInstall with:")
    print(" pip install geopandas matplotlib")
# Expected output:
# [Map visualization of primary substation areas]
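When setting up a map, it helps to know the extent of the data before drawing it. The sketch below computes a bounding box from a parsed GeoJSON FeatureCollection in pure Python, so it works without geopandas. The function names and sample coordinates are illustrative, and positions are assumed to be in [longitude, latitude] order.

```python
def iter_positions(coords):
    """Yield positions from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords  # a single [lon, lat, ...] position
    else:
        for part in coords:
            yield from iter_positions(part)


def bounding_box(feature_collection):
    """Return (min_lon, min_lat, max_lon, max_lat), or None if there are no coordinates."""
    lons, lats = [], []
    for feature in feature_collection.get("features", []):
        geometry = feature.get("geometry") or {}
        for lon, lat, *_ in iter_positions(geometry.get("coordinates", [])):
            lons.append(lon)
            lats.append(lat)
    if not lons:
        return None
    return (min(lons), min(lats), max(lons), max(lats))


# Illustrative collection: one point and one square polygon
sample_areas = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {},
         "geometry": {"type": "Point", "coordinates": [-0.5, 51.0]}},
        {"type": "Feature", "properties": {},
         "geometry": {"type": "Polygon", "coordinates": [
             [[0.0, 51.2], [1.0, 51.2], [1.0, 52.0], [0.0, 52.0], [0.0, 51.2]]
         ]}},
    ],
}
print(bounding_box(sample_areas))  # (-0.5, 51.0, 1.0, 52.0)
```

With geopandas installed, gdf.total_bounds returns the same four values for a GeoDataFrame.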
Summary¶
In this tutorial, you learned how to:
- Import the geo orchestrator: from ukpyn import gis
- List available datasets: gis.available_datasets
- Get primary substations: gis.get_primary_substations(licence_area=...)
- Get secondary sites: gis.get_secondary_sites(primary_substation=...)
- Get overhead lines: gis.get_overhead_lines(voltage="hv") or gis.get_overhead_lines(voltage="lv")
- Get poles: gis.get_poles(voltage="hv") or gis.get_poles(voltage="lv")
- Export to GeoJSON: gis.export_geojson(dataset_name)
Quick Reference¶
from ukpyn import gis
# List datasets
gis.available_datasets
# Get infrastructure data
gis.get_primary_substations(licence_area="London Power Networks (LPN)", limit=100)
gis.get_secondary_sites(primary_substation="SPN-S000000008608", limit=100)
gis.get_overhead_lines(voltage="hv", limit=100)
gis.get_overhead_lines(voltage="lv", limit=100)
gis.get_poles(voltage="hv", limit=100)
gis.get_poles(voltage="lv", limit=100)
# Export to GeoJSON
geojson_bytes = gis.export_geojson("primary_substations", limit=1000)
Next Steps¶
- Load GeoJSON files into QGIS or ArcGIS for detailed spatial analysis
- Combine infrastructure data with other UKPN datasets (demand, generation)
- Create web maps using Leaflet or Mapbox with the exported GeoJSON
- Perform spatial analysis using geopandas for buffer zones, intersections, etc.