Getting Started with ukpyn¶
Welcome to ukpyn!
ukpyn is a Python package aimed at improving the accessibility of the data hosted on UK Power Networks' Open Data Portal. The package is created and maintained by the UK Power Networks DSO Data Science & Development Team; you can contact us via the ukpyn GitHub repository issues board to start a conversation about the package or to get help.
We have prepared many tutorials and examples of how you might interact with the data on Open Data Portal, and combine them to unlock insights and value. We hope you find this information useful!
In this first tutorial, you will learn how to:
- Install ukpyn
- Set your API key
- Create a client and list datasets
- Read records from a dataset
- Export data in common formats
This notebook is written for beginners, so each step explains what is happening and why. We do assume that you have some knowledge of Python, have it installed, and are able to work with a Jupyter Notebook.
Let's begin!
1. Installation¶
Before you can use ukpyn, you need to install it in your Python environment.
Why do you need to install ukpyn?¶
ukpyn is a Python package (a reusable library) for working with UK Power Networks open data and is hosted on the Python Package Index (PyPI), a repository of software for the Python programming language.
Installing it adds that library to your Python environment, so that import ukpyn in your notebooks and scripts can find the library.
Make sure your Python environment is already set up, so that you are installing ukpyn into the right place.
Without installation, Python does not know what ukpyn is: it will not be found in the environment, and importing it will fail.
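Before moving on, it can be worth a quick sanity check. This is a general Python technique, not part of ukpyn: importlib can tell you whether a package is importable in the current environment.

```python
import importlib.util

def is_installed(package_name: str) -> bool:
    """Return True if the package can be found in the current environment."""
    return importlib.util.find_spec(package_name) is not None

# A module that ships with Python is always found:
print(is_installed("json"))   # True
# A package you have not installed yet is not:
print(is_installed("ukpyn"))
```

If `is_installed("ukpyn")` prints False, revisit the installation commands below before continuing.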
What are pip and uv?¶
pip is Python’s standard package installer. uv is a newer, faster package/dependency tool that can also install packages.
You can check whether these are installed by typing pip or uv into your terminal and seeing if you get an appropriate response.
We recommend using uv for installations, found here.
Installing the package¶
The ukpyn package has basic and full installation options.
If you are just going to use the package on your own and not work through the tutorials/examples, then you only need the basic installation.
If you want to work with these tutorials, you will need all the additional dependencies.
BASIC INSTALLATION
Run these in a terminal, not inside a Python code cell. In VS Code, open Terminal → New Terminal and run one of the following:
Option A: using pip
pip install ukpyn
Option B: using uv (faster, recommended)
uv add ukpyn
Both options install ukpyn and its required dependencies (for example, httpx and pydantic).
FULL INSTALLATION
If you wish to run all the tutorials, or perhaps contribute to the development of ukpyn, you will need the additional dependencies.
The tutorials use extra dependencies that are not required by the client itself. For example, we use some geospatial Python packages in tutorials/08-geospatial-data.ipynb.
To work through our tutorials, install the additional dependencies by running one of the following:
Option A: using pip
pip install "ukpyn[all]"
Option B: using uv (faster, recommended)
uv add "ukpyn[all]"
2. Setting Up Your API Key¶
All of our data is available on the UK Power Networks Open Data Portal.
There you can manually export to recognisable file formats like .csv and .xlsx.
However, it is much more efficient for code projects to have a computer program fetch the data for you. That is why all our datasets are accessible via API.
What is an API?¶
API stands for Application Programming Interface. It is a set of rules that lets one software system request data or services from another, so your code can talk directly to platforms like the UK Power Networks Open Data Portal automatically.
Where data is locked behind user accounts, as our Open Data Portal requires, we need to give your code some credentials to access the data. We call this an API key.
The API key should be kept private and not hardcoded into your scripts. There are a few options for doing this. First, let's generate your API key.
Generate your API key¶
- Open the UK Power Networks Open Data Portal
- Sign in (or create an account)
- Open your profile settings from the top-right menu
- Go to Portal Settings → API Keys
- Select Generate a new API key and give it a name like ukpyn
- Copy the generated key
Save the key as an environment variable¶
ukpyn looks for your API key in your environment variables, under a variable named UKPN_API_KEY.
You will need to create this environment variable for ukpyn to find. There are a few options.
Using a .env file (recommended):
Create a new file called .env in the current working directory of the notebook session; this is normally the project root where you opened your IDE.
We use dotenv, which looks in the current directory and then in parent directories for the .env file.
If you are still unsure where that is, you can print it by running the following from your terminal:
python
import os; print(os.getcwd())
# a directory path will print to the screen, e.g. C:\projects\ukpyn
quit()
Now that you have made your .env file, open it with an editor such as Notepad or Notepad++ and add the following, replacing your_api_key_here with the key generated in the previous step:
UKPN_API_KEY=your_api_key_here
Alternatively, set the variable directly in your terminal session.
Linux/macOS:
export UKPN_API_KEY="your-api-key-here"
Windows (Command Prompt):
set UKPN_API_KEY=your-api-key-here
Windows (PowerShell):
$env:UKPN_API_KEY = "your-api-key-here"
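Whichever option you use, you can confirm that Python can actually see the variable. This is plain standard-library code, independent of ukpyn:

```python
import os

# os.environ.get returns None (rather than raising) if the variable is unset
api_key = os.environ.get("UKPN_API_KEY")

if api_key:
    # Never print the full key; show a masked preview instead
    print(f"UKPN_API_KEY is set ({api_key[:4]}..., {len(api_key)} characters)")
else:
    print("UKPN_API_KEY is not set - check your .env file or shell export")
```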
import ukpyn
ukpyn.check_api_key()
print("API key configured!")
3. Creating a Client and Listing Datasets¶
A client is a Python object that sends requests to the API and returns results.
In this notebook, UKPNClient uses async methods, so API calls use await.
What does async mean here?
- Async methods are useful for network calls, where Python waits for the server response without blocking everything else.
- await means: “pause this function here, let other tasks run, then resume when the result is ready.”
- When using the UKPNClient, you will regularly see the keyword await immediately before a call, which just tells your program to relax while the client speaks to the Open Data Portal.
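You can see the same pattern with a toy coroutine built on the standard library's asyncio, with no network involved; the names here are illustrative and not part of ukpyn:

```python
import asyncio

async def fetch_pretend_data() -> str:
    # Simulate a slow network call: while we "wait", the event loop
    # is free to run other tasks
    await asyncio.sleep(0.1)
    return "records"

async def main() -> None:
    # 'await' pauses main() here until fetch_pretend_data() finishes
    result = await fetch_pretend_data()
    print(f"Got: {result}")

# In a plain script you start the event loop yourself; in a Jupyter
# notebook a loop is already running, so you would write `await main()`
asyncio.run(main())
```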
Let's create the client:
from ukpyn import UKPNClient
# Create a client instance
# The API key is automatically loaded from UKPN_API_KEY environment variable
client = UKPNClient()
# Or, pass the API key directly (not recommended for production code)
# client = UKPNClient(api_key="your-api-key-here")
# Display a summary of the API client
print(client.summary())
Listing Available Datasets¶
By this point, you have ukpyn installed, your API key configured, and have a client ready to go!
Now the fun begins.
Let's start by listing datasets so you can choose one to work with. The example below requests the first 10 datasets and prints their summaries.
# List the first 10 datasets
response = await client.list_datasets(limit=10)
print(f"Total datasets available: {response.total_count}")
print("Dataset summaries:\n")
for item in response.datasets:
dataset = item.dataset
print("-" * 60)
print(dataset.summary())
print()
# Expected output:
# Dataset(id='...', title='...', has_records=..., records=..., fields=...)
Searching for Datasets¶
Use where with the ODSQL search() function when you only know a keyword (for example, smart or flexibility).
This helps you quickly find likely dataset IDs before fetching records.
# Search for datasets containing "smart" in their metadata
response = await client.list_datasets(where='search(*, "smart")', limit=5)
print(f"Found {response.total_count} datasets matching 'smart'\n")
for item in response.datasets:
print(f"- {item.title} ({item.id})")
Getting Dataset Details¶
Once you have a dataset ID, request full details. This includes useful metadata and the list of available fields (columns).
# Get details for a specific dataset
# Replace with an actual dataset ID from the list above
dataset_id = "ukpn-smart-meter-installation-volumes"
dataset = await client.get_dataset(dataset_id)
# Detailed, human-readable breakdown (renders rich HTML in notebooks)
dataset.details()
4. Fetching Records from a Dataset¶
After choosing a dataset, you can fetch rows (called records) from it.
Start with a small limit so you can inspect the structure safely.
# Fetch 5 records from the dataset
dataset_id = "ukpn-smart-meter-installation-volumes"
records_response = await client.get_records(dataset_id, limit=5)
print(f"Total records in dataset: {records_response.total_count}")
print(f"Records returned: {len(records_response.records)}")
print("\n" + "=" * 60)
# Display each record's fields in a readable format
for i, record in enumerate(records_response.records, 1):
print(f"\nRecord {i} (ID: {record.id})")
print("-" * 40)
if record.fields:
for key, value in record.fields.items():
print(f" {key}: {value}")
Filtering Records¶
Use filters to return only the records you care about.
ukpyn supports ODSQL filters (OpenDataSoft Query Language) through query parameters like where and order_by.
# Example: Filter records with a WHERE clause
# Note: Adjust the field names and values based on your dataset
filtered_records = await client.get_records(
dataset_id, where='local_authority="Surrey" OR local_authority="Kent"'
)
print(f"\nFiltered results: {filtered_records.total_count} total records")
print(f"Showing first {len(filtered_records.records)} records\n")
for i, record in enumerate(filtered_records.records, 1):
print(f"Record {i} (ID: {record.id})")
print(record.fields)
print("-" * 40)
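Because filters are ordinary Python strings, you can assemble them programmatically before making the call. The build_where helper below is a hypothetical convenience, not part of ukpyn:

```python
def build_where(field: str, values: list[str]) -> str:
    """Combine several allowed values for one field into an ODSQL OR clause."""
    clauses = [f'{field}="{value}"' for value in values]
    return " OR ".join(clauses)

where = build_where("local_authority", ["Surrey", "Kent"])
print(where)  # local_authority="Surrey" OR local_authority="Kent"
```

The resulting string can be passed straight to the where parameter.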
Selecting Specific Fields¶
Use select to return only the columns you need.
This makes responses smaller and easier to read, especially for wide datasets.
# Select only specific fields to reduce response size
try:
# Adjust field names based on your dataset
records = await client.get_records(
dataset_id,
limit=5,
# select="field1, field2", # Uncomment and adjust
)
print(f"Retrieved {len(records.records)} records")
# Display as a simple table
for record in records.records:
if record.fields:
print(record.fields)
except Exception as e:
print(f"Error: {e}")
Pagination¶
Large datasets are usually returned in pages.
Use limit (page size) and offset (start position) to move through results page by page.
# Paginate through records
page_size = 10
page_number = 0 # 0-indexed
try:
records = await client.get_records(
dataset_id,
limit=page_size,
offset=page_number * page_size,
)
total_pages = (records.total_count + page_size - 1) // page_size
print(f"Page {page_number + 1} of {total_pages}")
print(
f"Showing records {page_number * page_size + 1} to {page_number * page_size + len(records.records)}"
)
print(f"Total records: {records.total_count}")
except Exception as e:
print(f"Error: {e}")
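The offset arithmetic is worth pulling out on its own. The helpers below are a standalone sketch (no API calls) of how limit, offset, and page numbers relate:

```python
def page_offset(page_number: int, page_size: int) -> int:
    """Offset of the first record on a 0-indexed page."""
    return page_number * page_size

def total_pages(total_count: int, page_size: int) -> int:
    """Pages needed to cover total_count records (ceiling division)."""
    return (total_count + page_size - 1) // page_size

# A dataset with 95 records, read 10 at a time:
print(total_pages(95, 10))  # 10
print(page_offset(0, 10))   # 0  -> records 1-10
print(page_offset(9, 10))   # 90 -> records 91-95 (a short final page)
```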
5. Exporting Data to Different Formats¶
You can export data in formats that fit different tools and workflows:
- json: good for APIs and nested structures
- csv: simple tabular text, works in many tools
- xlsx: Excel workbook format
- geojson: geospatial JSON format
- shapefile: common GIS exchange format
- kml: map format used by tools like Google Earth
from ukpyn import EXPORT_FORMATS
print("Available export formats:")
for fmt in EXPORT_FORMATS:
print(f" - {fmt}")
Export to CSV¶
CSV is a good default when you want to inspect or share table-like data quickly.
# Export dataset to CSV
try:
from pathlib import Path
csv_data = await client.export_data(
dataset_id,
format="csv",
limit=100, # Limit to 100 records for this example
)
save_dir = None # Set to a directory (e.g. "exports") to enable writing files.
if save_dir:
output_file = Path(save_dir) / "export.csv"
output_file.parent.mkdir(parents=True, exist_ok=True)
with open(output_file, "wb") as f:
f.write(csv_data)
print(f"Exported {len(csv_data)} bytes to {output_file}")
else:
print(
f"Exported {len(csv_data)} bytes (file save skipped; set save_dir to enable writing)."
)
# Preview first few lines
print("\nPreview (first 500 characters):")
print(csv_data.decode("utf-8")[:500])
except Exception as e:
print(f"Export error: {e}")
Export to JSON¶
JSON is useful when you want structured data for scripts, apps, or APIs.
import json
# Export dataset to JSON
try:
json_data = await client.export_data(
dataset_id,
format="json",
limit=10,
)
# Parse and pretty-print
data = json.loads(json_data)
print(json.dumps(data[:2], indent=2)) # Show first 2 records
except Exception as e:
print(f"Export error: {e}")
Export to Excel¶
Use Excel export when your audience prefers spreadsheet files.
# Export dataset to Excel
try:
from pathlib import Path
xlsx_data = await client.export_data(
dataset_id,
format="xlsx",
limit=100,
)
save_dir = None # Set to a directory (e.g. "exports") to enable writing files.
if save_dir:
output_file = Path(save_dir) / "export.xlsx"
output_file.parent.mkdir(parents=True, exist_ok=True)
with open(output_file, "wb") as f:
f.write(xlsx_data)
print(f"Exported {len(xlsx_data)} bytes to {output_file}")
print("Open the file in Excel or use pandas to read it.")
else:
print(
f"Exported {len(xlsx_data)} bytes (file save skipped; set save_dir to enable writing)."
)
except Exception as e:
print(f"Export error: {e}")
Working with pandas (optional)¶
If you use pandas, you can load exports into a DataFrame for analysis and charting.
# Optional: Use pandas for data analysis
# Make sure pandas is installed: pip install pandas
try:
from io import BytesIO
import pandas as pd
# Export to CSV and load into pandas
csv_data = await client.export_data(
dataset_id,
format="csv",
limit=100,
)
# Create DataFrame (OpenDataSoft CSV exports typically use ';' as the separator)
df = pd.read_csv(BytesIO(csv_data), sep=";")
print(f"DataFrame shape: {df.shape}")
print(f"\nColumns: {list(df.columns)}")
print("\nFirst 5 rows:")
display(df.head())
except ImportError:
print("pandas is not installed. Install it with: pip install pandas")
except Exception as e:
print(f"Error: {e}")
Cleaning Up¶
Close the client when you are finished to release network resources cleanly.
# Close the client when done
await client.close()
print("Client closed.")
Using Context Managers (Recommended)¶
async with is the safest pattern because the client is closed automatically, even if an error happens.
# Recommended: Use async context manager for automatic cleanup
async with UKPNClient() as client:
datasets = await client.list_datasets(limit=3)
print(f"Found {datasets.total_count} datasets")
# Client is automatically closed when exiting the 'async with' block
print("Client automatically closed!")
Error Handling¶
ukpyn raises specific exception types so you can show clearer messages and recover gracefully.
from ukpyn import (
AuthenticationError,
NotFoundError,
RateLimitError,
ServerError,
UKPNError,
ValidationError,
)
async with UKPNClient() as client:
try:
# Try to access a non-existent dataset
dataset = await client.get_dataset("this-dataset-does-not-exist")
except NotFoundError as e:
print(f"Dataset not found: {e}")
except AuthenticationError as e:
print(f"Authentication failed: {e}")
print("Tip: Check that your UKPN_API_KEY is correct.")
except RateLimitError as e:
print(f"Rate limit exceeded: {e}")
if e.retry_after:
print(f"Try again in {e.retry_after} seconds.")
except ValidationError as e:
print(f"Invalid request: {e}")
except ServerError as e:
print(f"Server error: {e}")
print("Tip: The UKPN API might be experiencing issues. Try again later.")
except UKPNError as e:
# Catch-all for other API errors
print(f"API error: {e}")
Summary¶
You now know how to:
- Install ukpyn
- Configure UKPN_API_KEY
- Create a UKPNClient and list datasets
- Search datasets and inspect their fields
- Fetch records with filtering, field selection, and pagination
- Export data in common formats
- Handle common API errors
Next Steps¶
- Browse datasets on the UK Power Networks Open Data Portal
- Learn basic ODSQL filtering
- Continue with the next ukpyn tutorials for domain-specific examples
You are ready to start exploring real UK Power Networks open data.