Parsing CIM XML Network Data
This tutorial demonstrates how to parse a CIM (Common Information Model) XML file from UK Power Networks' LTDS publication and build a pandapower network from it.
What is CIM?
The Common Information Model (CIM) is an IEC standard (IEC 61970/61968) used by Distribution Network Operators to describe power system equipment, topology, and connectivity in a vendor-neutral XML/RDF format.
UKPN publishes LTDS network models as CIM XML files. Each file contains:
| CIM Class | Description |
|---|---|
| Substation | Named substations (Grid, Primary, etc.) |
| VoltageLevel | Voltage levels within a substation |
| PowerTransformer / PowerTransformerEnd | Transformers with impedances and ratings |
| ACLineSegment | Overhead lines and cables with R, X, B parameters |
| EnergyConsumer | Load points |
| ConnectivityNode / Terminal | Topology and connectivity |
| PhotoVoltaicUnit, BatteryUnit, SynchronousMachine | Generation and storage |
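To make the format concrete, the sketch below parses a tiny hand-written RDF/XML fragment. The fragment, its identifiers (`_sub1`, `_vl1`, `_bv33`) and the names `SAMPLE`, `sample_root` and `objects_by_id` are made up for illustration and are not taken from the UKPN files; only the namespace URIs and class names follow the ones used in this tutorial. It shows how an `rdf:resource` reference on one object points at the `rdf:ID` of another.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
SAMPLE_NS = {"cim": "http://iec.ch/TC57/CIM100#"}

# Hypothetical fragment for illustration only; IDs and values are invented.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:cim="http://iec.ch/TC57/CIM100#">
  <cim:BaseVoltage rdf:ID="_bv33">
    <cim:BaseVoltage.nominalVoltage>33</cim:BaseVoltage.nominalVoltage>
  </cim:BaseVoltage>
  <cim:Substation rdf:ID="_sub1">
    <cim:IdentifiedObject.name>Example Primary</cim:IdentifiedObject.name>
  </cim:Substation>
  <cim:VoltageLevel rdf:ID="_vl1">
    <cim:IdentifiedObject.name>33kV</cim:IdentifiedObject.name>
    <cim:VoltageLevel.Substation rdf:resource="#_sub1"/>
    <cim:VoltageLevel.BaseVoltage rdf:resource="#_bv33"/>
  </cim:VoltageLevel>
</rdf:RDF>
"""

sample_root = ET.fromstring(SAMPLE)

# Index every top-level object by its rdf:ID
objects_by_id = {
    el.get(RDF + "ID"): el for el in sample_root if el.get(RDF + "ID")
}

# Follow the VoltageLevel -> Substation reference ("#_sub1" -> "_sub1")
vl = sample_root.find("cim:VoltageLevel", SAMPLE_NS)
sub_ref = vl.find("cim:VoltageLevel.Substation", SAMPLE_NS).get(RDF + "resource")
substation = objects_by_id[sub_ref.removeprefix("#")]
sub_name = substation.find("cim:IdentifiedObject.name", SAMPLE_NS).text
print(sub_name)
```

The tutorial code later in this notebook strips the leading `_` from identifiers as well; the sketch keeps it so the keys match the file text exactly.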
How to Get CIM Data
The LTDS CIM dataset is a "Shared" dataset that requires special access. To request access:
- Register and log in to the UKPN Open Data Portal
- Visit the LTDS CIM page and complete the Shared Data Request Form
Once approved, CIM data is published as XML file attachments (one per licence area: EPN, SPN, LPN). You can download the XML files directly from the portal.
This tutorial includes a small example file so you can follow along without needing portal access.
Prerequisites
- Complete 01-getting-started.ipynb first
- Install pandapower:
pip install "ukpyn[all]"
The example file used here is cim-example.xml, an excerpt from the EPN licence area LTDS Equipment profile.
1. Load and Explore the CIM XML
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

CIM_FILE = Path("cim-example.xml")
tree = ET.parse(CIM_FILE)
root = tree.getroot()

# CIM namespaces used in UKPN LTDS files
NS = {
    "cim": "http://iec.ch/TC57/CIM100#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "md": "http://iec.ch/TC57/61970-552/ModelDescription/1#",
    "gb": "http://ofgem.gov.uk/ns/CIM/LTDS/Extensions#",
    "eu": "http://iec.ch/TC57/CIM100-European#",
}
RDF_ID = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}ID"
RDF_RES = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"

# Count element types
counter = Counter()
for elem in root:
    tag = elem.tag.split("}")[-1] if "}" in elem.tag else elem.tag
    counter[tag] += 1

print(f"Total CIM elements: {sum(counter.values()):,}")
print(f"Distinct types: {len(counter)}\n")
for tag, count in counter.most_common(15):
    print(f" {tag:35s} {count:>6,}")
2. Extract Key Network Components
We'll build lookup dictionaries for the main CIM classes. Each element's rdf:ID
attribute is the unique key used to cross-reference related objects.
def get_text(elem, child_tag):
    """Get text content of a child element, or None."""
    child = elem.find(child_tag, NS)
    return child.text if child is not None and child.text else None

def get_ref(elem, child_tag):
    """Get rdf:resource reference (stripped of leading '#' and '_'), or None."""
    child = elem.find(child_tag, NS)
    if child is not None:
        ref = child.get(RDF_RES, "")
        return ref.lstrip("#_") if ref else None
    return None

def get_float(elem, child_tag):
    """Get float value of a child element, or None."""
    val = get_text(elem, child_tag)
    return float(val) if val is not None else None
# --- BaseVoltage lookup (ID -> kV) ---
base_voltages = {}
for bv in root.findall("cim:BaseVoltage", NS):
    bv_id = bv.get(RDF_ID).lstrip("_")
    kv = get_float(bv, "cim:BaseVoltage.nominalVoltage")
    base_voltages[bv_id] = kv

print(f"Base voltages: {len(base_voltages)}")
for _bv_id, kv in sorted(base_voltages.items(), key=lambda x: x[1] or 0):
    print(f" {kv} kV")
import pandas as pd

# --- Substations ---
substations = {}
for sub in root.findall("cim:Substation", NS):
    sub_id = sub.get(RDF_ID).lstrip("_")
    substations[sub_id] = {
        "name": get_text(sub, "cim:IdentifiedObject.name"),
        "region": get_ref(sub, "cim:Substation.Region"),
    }

print(f"Substations: {len(substations)}")
df_subs = pd.DataFrame.from_dict(substations, orient="index")
df_subs.head(10)
# --- Voltage Levels ---
voltage_levels = {}
for vl in root.findall("cim:VoltageLevel", NS):
    vl_id = vl.get(RDF_ID).lstrip("_")
    bv_ref = get_ref(vl, "cim:VoltageLevel.BaseVoltage")
    voltage_levels[vl_id] = {
        "name": get_text(vl, "cim:IdentifiedObject.name"),
        "substation_id": get_ref(vl, "cim:VoltageLevel.Substation"),
        "base_kv": base_voltages.get(bv_ref) if bv_ref else None,
    }

print(f"Voltage levels: {len(voltage_levels)}")
df_vl = pd.DataFrame.from_dict(voltage_levels, orient="index")
df_vl["base_kv"].value_counts().sort_index()
3. Parse Transformers
CIM splits transformer data across two classes:
- PowerTransformer — the transformer itself (container, name)
- PowerTransformerEnd — each winding's electrical parameters (ratedS, ratedU, r, x)
We join them to get a complete picture.
# --- PowerTransformers ---
transformers = {}
for pt in root.findall("cim:PowerTransformer", NS):
    pt_id = pt.get(RDF_ID).lstrip("_")
    transformers[pt_id] = {
        "name": get_text(pt, "cim:IdentifiedObject.name"),
        "container": get_ref(pt, "cim:Equipment.EquipmentContainer"),
        "ends": [],  # populated below
    }
# --- PowerTransformerEnds ---
for pte in root.findall("cim:PowerTransformerEnd", NS):
    pt_ref = get_ref(pte, "cim:PowerTransformerEnd.PowerTransformer")
    if pt_ref and pt_ref in transformers:
        bv_ref = get_ref(pte, "cim:TransformerEnd.BaseVoltage")
        transformers[pt_ref]["ends"].append(
            {
                "end_number": int(get_text(pte, "cim:TransformerEnd.endNumber") or 0),
                "ratedS_mva": get_float(pte, "cim:PowerTransformerEnd.ratedS"),
                "ratedU_kv": get_float(pte, "cim:PowerTransformerEnd.ratedU"),
                "r_ohm": get_float(pte, "cim:PowerTransformerEnd.r"),
                "x_ohm": get_float(pte, "cim:PowerTransformerEnd.x"),
                "base_kv": base_voltages.get(bv_ref) if bv_ref else None,
            }
        )

print(f"Transformers: {len(transformers)}")
print(f"Transformer ends: {sum(len(t['ends']) for t in transformers.values())}")
# Show a sample
sample_id = next(k for k, v in transformers.items() if len(v["ends"]) == 2)
sample = transformers[sample_id]
print(f"\nSample: {sample['name']}")
for end in sorted(sample["ends"], key=lambda e: e["end_number"]):
    print(
        f" End {end['end_number']}: {end['ratedU_kv']} kV, {end['ratedS_mva']} MVA, "
        f"R={end['r_ohm']}, X={end['x_ohm']}"
    )
4. Parse AC Line Segments
lines = []
for seg in root.findall("cim:ACLineSegment", NS):
    bv_ref = get_ref(seg, "cim:ConductingEquipment.BaseVoltage")
    lines.append(
        {
            "id": seg.get(RDF_ID).lstrip("_"),
            "name": get_text(seg, "cim:IdentifiedObject.name"),
            "r_ohm": get_float(seg, "cim:ACLineSegment.r"),
            "x_ohm": get_float(seg, "cim:ACLineSegment.x"),
            "bch_s": get_float(seg, "cim:ACLineSegment.bch"),
            "length_km": get_float(seg, "cim:Conductor.length"),
            "base_kv": base_voltages.get(bv_ref) if bv_ref else None,
        }
    )

df_lines = pd.DataFrame(lines)
print(f"AC Line Segments: {len(df_lines)}")
df_lines.head(10)

# Voltage distribution of lines
print("Lines by voltage level:")
df_lines["base_kv"].value_counts().sort_index()
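pandapower describes lines with per-kilometre parameters, whereas the columns above hold one value per segment. A minimal sketch of the conversion follows, assuming that `r`, `x` and `bch` are totals for the whole segment, that the length really is in km (as the column name above assumes; check the units in your file), and that the system frequency is 50 Hz. `line_params_per_km` is a hypothetical helper, not part of ukpyn or pandapower; its output keys are chosen to match the argument names of `pp.create_line_from_parameters`.

```python
import math

def line_params_per_km(r_ohm, x_ohm, bch_s, length_km, f_hz=50.0):
    """Convert segment totals to per-km values (hypothetical helper)."""
    if not length_km or length_km <= 0:
        raise ValueError("length_km must be a positive number")
    # Shunt susceptance B = 2*pi*f*C, so C = B / (2*pi*f); a missing B is treated as 0
    c_total_farad = (bch_s or 0.0) / (2 * math.pi * f_hz)
    return {
        "length_km": length_km,
        "r_ohm_per_km": r_ohm / length_km,
        "x_ohm_per_km": x_ohm / length_km,
        "c_nf_per_km": c_total_farad * 1e9 / length_km,
    }

# Made-up values for illustration: a 2 km segment
example_line = line_params_per_km(r_ohm=0.4, x_ohm=0.8, bch_s=1e-4, length_km=2.0)
print(example_line)
```

pandapower also requires a thermal rating (`max_i_ka`) for each line, which would have to come from elsewhere in the dataset.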
5. Parse Loads (EnergyConsumers)
loads = []
for ec in root.findall("cim:EnergyConsumer", NS):
    loads.append(
        {
            "id": ec.get(RDF_ID).lstrip("_"),
            "name": get_text(ec, "cim:IdentifiedObject.name"),
            "description": get_text(ec, "cim:IdentifiedObject.description"),
            "container": get_ref(ec, "cim:Equipment.EquipmentContainer"),
        }
    )

df_loads = pd.DataFrame(loads)
print(f"Energy Consumers (loads): {len(df_loads)}")
df_loads.head(10)
6. Parse Generation and Storage
CIM represents generation with several classes:
- SynchronousMachine — conventional generators
- PhotoVoltaicUnit — solar PV
- BatteryUnit — battery energy storage
- WindGeneratingUnit — wind turbines
# Solar PV
pv_units = []
for pv in root.findall("cim:PhotoVoltaicUnit", NS):
    pv_units.append(
        {
            "name": get_text(pv, "cim:IdentifiedObject.name"),
            "maxP_mw": get_float(pv, "cim:PowerElectronicsUnit.maxP"),
        }
    )

# Battery storage
batteries = []
for bat in root.findall("cim:BatteryUnit", NS):
    batteries.append(
        {
            "name": get_text(bat, "cim:IdentifiedObject.name"),
            "maxP_mw": get_float(bat, "cim:PowerElectronicsUnit.maxP"),
            "ratedE_mwh": get_float(bat, "cim:BatteryUnit.ratedE"),
        }
    )

# Synchronous machines
sync_machines = []
for sm in root.findall("cim:SynchronousMachine", NS):
    sync_machines.append(
        {
            "name": get_text(sm, "cim:IdentifiedObject.name"),
            "maxExportP_mw": get_float(sm, "cim:RotatingMachine.maxExportP"),
        }
    )

# Wind
wind_units = []
for wu in root.findall("cim:WindGeneratingUnit", NS):
    wind_units.append(
        {
            "name": get_text(wu, "cim:IdentifiedObject.name"),
            "maxP_mw": get_float(wu, "cim:GeneratingUnit.maxOperatingP"),
        }
    )

print(
    f"Solar PV units: {len(pv_units):>4} - Total {sum(p['maxP_mw'] or 0 for p in pv_units):.1f} MW"
)
print(
    f"Battery units: {len(batteries):>4} - Total {sum(b['maxP_mw'] or 0 for b in batteries):.1f} MW"
)
print(
    f"Synchronous machines: {len(sync_machines):>4} - Total {sum(s['maxExportP_mw'] or 0 for s in sync_machines):.1f} MW"
)
print(f"Wind units: {len(wind_units):>4}")
7. Build a pandapower Network from CIM Data
We'll select a single substation with 33/11 kV transformers and build a small pandapower model from the CIM data. This mirrors the approach in 09-pandapower-import.ipynb but sourced directly from XML.
# Find substations that have 2-winding transformers with both HV and LV data
# First, map EquipmentContainer (substation/voltage-level) -> substation name
container_to_sub = {}
for vl_id, vl in voltage_levels.items():
    sub_id = vl["substation_id"]
    if sub_id and sub_id in substations:
        container_to_sub[vl_id] = substations[sub_id]["name"]
for sub_id, sub in substations.items():
    container_to_sub[sub_id] = sub["name"]

# Find transformers with exactly 2 ends and real impedance data
good_trafos = []
for pt_id, pt in transformers.items():
    ends = pt["ends"]
    if len(ends) != 2:
        continue
    ends_sorted = sorted(ends, key=lambda e: e["end_number"])
    hv_end = ends_sorted[0]  # end 1 = HV
    lv_end = ends_sorted[1]  # end 2 = LV
    # Require every value the pandapower model needs later, including the LV
    # rated voltage and the HV-side resistance (which may legitimately be 0)
    if (
        hv_end["ratedS_mva"]
        and hv_end["ratedU_kv"]
        and hv_end["x_ohm"]
        and hv_end["r_ohm"] is not None
        and lv_end["ratedU_kv"]
    ):
        sub_name = container_to_sub.get(pt["container"], "Unknown")
        good_trafos.append(
            {
                "id": pt_id,
                "name": pt["name"],
                "substation": sub_name,
                "hv_kv": hv_end["ratedU_kv"],
                "lv_kv": lv_end["ratedU_kv"],
                "sn_mva": hv_end["ratedS_mva"],
                "r_ohm": hv_end["r_ohm"],
                "x_ohm": hv_end["x_ohm"],
            }
        )

df_trafos = pd.DataFrame(good_trafos)
print(f"Transformers with complete 2-winding data: {len(df_trafos)}")

# Show substations with the most transformers
print("\nTop substations by transformer count:")
df_trafos["substation"].value_counts().head(10)
import math
import pandapower as pp

# Pick a substation: choose one with a few transformers
sub_counts = df_trafos["substation"].value_counts()
TARGET_SUB = sub_counts[(sub_counts >= 2) & (sub_counts <= 6)].index[0]
print(f"Selected substation: {TARGET_SUB}")

df_sub_trafos = df_trafos[df_trafos["substation"] == TARGET_SUB].copy()
display(df_sub_trafos[["name", "hv_kv", "lv_kv", "sn_mva", "r_ohm", "x_ohm"]])
# Build the pandapower network
net = pp.create_empty_network(name=f"CIM - {TARGET_SUB}")

# Create buses from unique voltage levels
bus_map = {}  # (name, kv) -> bus index
for _, row in df_sub_trafos.iterrows():
    for _side, kv_col in [("HV", "hv_kv"), ("LV", "lv_kv")]:
        kv = row[kv_col]
        bus_name = f"{TARGET_SUB} {kv:.0f}kV"
        key = (bus_name, kv)
        if key not in bus_map:
            bus_map[key] = pp.create_bus(net, vn_kv=kv, name=bus_name)
# Create transformers
for _, row in df_sub_trafos.iterrows():
    hv_bus = bus_map[(f"{TARGET_SUB} {row['hv_kv']:.0f}kV", row["hv_kv"])]
    lv_bus = bus_map[(f"{TARGET_SUB} {row['lv_kv']:.0f}kV", row["lv_kv"])]
    sn_mva = row["sn_mva"]
    vn_hv = row["hv_kv"]

    # Convert ohmic impedance to percent on transformer base
    z_base = (vn_hv**2) / sn_mva
    vkr_percent = (row["r_ohm"] / z_base) * 100
    vkx_percent = (row["x_ohm"] / z_base) * 100
    vk_percent = math.sqrt(vkr_percent**2 + vkx_percent**2)

    pp.create_transformer_from_parameters(
        net,
        hv_bus=hv_bus,
        lv_bus=lv_bus,
        sn_mva=sn_mva,
        vn_hv_kv=vn_hv,
        vn_lv_kv=row["lv_kv"],
        vk_percent=vk_percent,
        vkr_percent=vkr_percent,
        pfe_kw=0,
        i0_percent=0.1,
        name=row["name"],
    )
# External grid at the HV bus
gsp_bus = net.bus["vn_kv"].idxmax()
pp.create_ext_grid(net, bus=gsp_bus, vm_pu=1.0, name="Grid")

# Add a representative load on each LV bus
for bus_idx in net.bus.index:
    if bus_idx == gsp_bus:
        continue
    pp.create_load(
        net,
        bus=bus_idx,
        p_mw=5.0,
        q_mvar=1.5,
        name=f"Load {net.bus.loc[bus_idx, 'name']}",
    )

print(f"Buses: {len(net.bus)}, Transformers: {len(net.trafo)}, Loads: {len(net.load)}")
display(net.bus)
display(
    net.trafo[["name", "sn_mva", "vn_hv_kv", "vn_lv_kv", "vk_percent", "vkr_percent"]]
)
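The ohm-to-percent conversion used when creating the transformers can be checked in isolation. The sketch below repeats the same arithmetic on made-up numbers, assuming (as the cell above does) that R and X are referred to the HV winding, so the base impedance is the HV rated voltage squared divided by the rated power (kV² / MVA gives ohms). `ohms_to_percent` is a hypothetical helper introduced here for illustration.

```python
import math

def ohms_to_percent(r_ohm, x_ohm, vn_kv, sn_mva):
    """Return (vkr_percent, vk_percent) on the transformer's own base."""
    z_base = vn_kv**2 / sn_mva          # kV^2 / MVA = ohm
    vkr = r_ohm / z_base * 100          # resistive part of short-circuit voltage
    vkx = x_ohm / z_base * 100          # reactive part
    return vkr, math.hypot(vkr, vkx)    # magnitude combines both parts

# Illustrative values only: a 33 kV, 20 MVA unit has z_base = 54.45 ohm,
# so R = 0.5445 ohm is 1 % and X = 5.445 ohm is 10 %.
example_vkr, example_vk = ohms_to_percent(
    r_ohm=0.5445, x_ohm=5.445, vn_kv=33.0, sn_mva=20.0
)
print(f"vkr = {example_vkr:.3f} %, vk = {example_vk:.3f} %")
```

If the impedances in a given file turn out to be referred to the LV winding instead, the LV rated voltage would be the appropriate base.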
8. Run Power Flow
pp.runpp(net, algorithm="nr", calculate_voltage_angles=True)
print("Power flow converged!\n")
print("Bus Results:")
display(net.res_bus.join(net.bus["name"]))
print("\nTransformer Loading:")
trafo_results = net.res_trafo[["p_hv_mw", "q_hv_mvar", "loading_percent"]].copy()
trafo_results.insert(0, "name", net.trafo["name"])
display(trafo_results)
9. Network Statistics Summary
print("=" * 50)
print(f"CIM FILE SUMMARY - {CIM_FILE.name}")
print("=" * 50)
print(f" Substations: {len(substations):>6,}")
print(f" Voltage levels: {len(voltage_levels):>6,}")
print(f" Transformers: {len(transformers):>6,}")
print(f" AC line segments: {len(df_lines):>6,}")
print(f" Loads: {len(df_loads):>6,}")
print(
    f" Solar PV: {len(pv_units):>6,} ({sum(p['maxP_mw'] or 0 for p in pv_units):.0f} MW)"
)
print(
    f" Batteries: {len(batteries):>6,} ({sum(b['maxP_mw'] or 0 for b in batteries):.0f} MW)"
)
print(
    f" Synchronous machines: {len(sync_machines):>6,} ({sum(s['maxExportP_mw'] or 0 for s in sync_machines):.0f} MW)"
)
print(f" Wind units: {len(wind_units):>6,}")
Summary
This tutorial demonstrated:
- Loading a CIM XML file and exploring its structure
- Extracting substations, voltage levels, transformers, lines, loads, and generation
- Building a pandapower network from CIM transformer impedance data
- Running a power flow analysis on the resulting model
Next Steps
- Extend the network by adding ACLineSegment data as pandapower lines
- Use Terminal and ConnectivityNode elements to resolve full bus topology
- Combine CIM structure with LTDS Table 3a demand data for realistic loading
- See 09-pandapower-import.ipynb for the tabular LTDS approach
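As a starting point for the topology step, the sketch below shows one way Terminal elements could be used to find which ConnectivityNodes each piece of equipment touches. It runs on a small hand-written fragment; the identifiers (`_line1`, `_cn1`, `_cn2`, `_t1`, `_t2`) and the helper `resource_id` are made up for illustration, and it assumes the file uses the standard CIM properties `Terminal.ConductingEquipment`, `Terminal.ConnectivityNode` and `ACDCTerminal.sequenceNumber`. Whether the LTDS files populate all three should be checked against the real data.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

CIM_NS = "http://iec.ch/TC57/CIM100#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TOPO_NS = {"cim": CIM_NS, "rdf": RDF_NS}
RESOURCE = f"{{{RDF_NS}}}resource"

# Hypothetical fragment: one line segment connected between two nodes
TOPOLOGY_SAMPLE = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:cim="{CIM_NS}">
  <cim:ConnectivityNode rdf:ID="_cn1"/>
  <cim:ConnectivityNode rdf:ID="_cn2"/>
  <cim:ACLineSegment rdf:ID="_line1"/>
  <cim:Terminal rdf:ID="_t2">
    <cim:Terminal.ConductingEquipment rdf:resource="#_line1"/>
    <cim:Terminal.ConnectivityNode rdf:resource="#_cn2"/>
    <cim:ACDCTerminal.sequenceNumber>2</cim:ACDCTerminal.sequenceNumber>
  </cim:Terminal>
  <cim:Terminal rdf:ID="_t1">
    <cim:Terminal.ConductingEquipment rdf:resource="#_line1"/>
    <cim:Terminal.ConnectivityNode rdf:resource="#_cn1"/>
    <cim:ACDCTerminal.sequenceNumber>1</cim:ACDCTerminal.sequenceNumber>
  </cim:Terminal>
</rdf:RDF>
"""

def resource_id(elem, tag):
    """Return the referenced ID without the leading '#', or None."""
    child = elem.find(tag, TOPO_NS)
    if child is None:
        return None
    return child.get(RESOURCE, "").removeprefix("#") or None

# Collect (sequence number, node) pairs per piece of equipment
terminals_by_equipment = defaultdict(list)
for term in ET.fromstring(TOPOLOGY_SAMPLE).findall("cim:Terminal", TOPO_NS):
    equipment = resource_id(term, "cim:Terminal.ConductingEquipment")
    node = resource_id(term, "cim:Terminal.ConnectivityNode")
    seq = int(
        term.findtext("cim:ACDCTerminal.sequenceNumber", default="0", namespaces=TOPO_NS)
    )
    if equipment and node:
        terminals_by_equipment[equipment].append((seq, node))

# Order each equipment's nodes by terminal sequence number
equipment_nodes = {
    eq: [node for _, node in sorted(pairs)]
    for eq, pairs in terminals_by_equipment.items()
}
print(equipment_nodes)
```

With a mapping like this, each ConnectivityNode could become a pandapower bus and each two-terminal ACLineSegment a line between its two buses.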