Getting Started with ukpyn¶
Welcome to ukpyn!
ukpyn is a Python package aimed at improving the accessibility of the data hosted on UK Power Networks' Open Data Portal. The package is created and maintained by the UK Power Networks DSO Data Science & Development Team; you can contact us via the ukpyn GitHub repository issues board to start a conversation about the package or to get help.
We have prepared many tutorials and examples of how you might interact with the data on Open Data Portal, and combine them to unlock insights and value. We hope you find this information useful!
In this first tutorial, you will learn how to:
- Install ukpyn
- Set your API key
- Create a client and list datasets
- Read records from a dataset
- Export data in common formats
This notebook is written for beginners, so each step explains what is happening and why. We do assume that you have some knowledge of Python, have it installed, and are able to work with a Jupyter Notebook.
Let's begin!
1. Installation¶
Before you can use ukpyn, you need to install it in your Python environment.
Why do you need to install ukpyn?¶
ukpyn is a Python package (a reusable library) for working with UK Power Networks open data and is hosted on the Python Package Index (PyPI), a repository of software for the Python programming language.
Installing it adds that library to your Python environment, so that import ukpyn in your notebooks and scripts can find the library.
Make sure your Python environment is already set up, so that you are installing ukpyn into the right place.
Without installation, Python does not know what ukpyn is: it will not be found in the environment, and importing it will fail.
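Before moving on, it can be worth a quick sanity check. This is a general Python technique, not part of ukpyn: importlib can tell you whether a package is importable in the current environment.

```python
import importlib.util

def is_installed(package_name: str) -> bool:
    """Return True if the package can be found in the current environment."""
    return importlib.util.find_spec(package_name) is not None

# A module that ships with Python is always found:
print(is_installed("json"))   # True
# A package you have not installed yet is not:
print(is_installed("ukpyn"))
```

If `is_installed("ukpyn")` prints False, revisit the installation commands below before continuing.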
What are pip and uv?¶
pip is Python’s standard package installer. uv is a newer, faster package/dependency tool that can also install packages.
You can check whether these are installed by typing pip or uv into your terminal and seeing if you get an appropriate response.
We recommend using uv for installations, found here.
Installing the package¶
The ukpyn package has basic and full installation options.
If you are just going to use the package on your own and not work through the tutorials/examples, then you only need the basic installation.
If you want to work with these tutorials, you will need all the additional dependencies.
BASIC INSTALLATION
Run these in a terminal, not inside a Python code cell. In VS Code, open Terminal → New Terminal and run one of the following:
Option A: using pip
pip install ukpyn
Option B: using uv (faster, recommended)
uv add ukpyn
Both options install ukpyn and its required dependencies (for example, httpx and pydantic).
FULL INSTALLATION
If you wish to run all the tutorials, or perhaps contribute to the development of ukpyn, you will need the additional dependencies.
The tutorials use extra dependencies that are not required by the client itself. For example, we use some geospatial Python packages in tutorials/08-geospatial-data.ipynb.
To work through our tutorials, install the additional dependencies by running one of the following:
Option A: using pip
pip install "ukpyn[all]"
Option B: using uv (faster, recommended)
uv add "ukpyn[all]"
2. Setting Up Your API Key¶
All of our data is available on the UK Power Networks Open Data Portal.
There you can manually export to recognisable file formats like .csv and .xlsx.
However, it is much more efficient for code projects to have a computer program fetch the data for you. That is why all our datasets are accessible via API.
What is an API?¶
API stands for Application Programming Interface. It is a set of rules that lets one software system request data or services from another, so your code can talk directly to platforms like the UK Power Networks Open Data Portal automatically.
Where data is locked behind user accounts, as our Open Data Portal requires, we need to give your code some credentials to access the data. We call this an API key.
The API key should be kept private and not hardcoded into your scripts. There are a few options for doing this. First, let's generate your API key.
Generate your API key¶
- Open the UK Power Networks Open Data Portal
- Sign in (or create an account)
- Open your profile settings from the top-right menu
- Go to Portal Settings → API Keys
- Select Generate a new API key and give it a name like ukpyn
- Copy the generated key
Save the key as an environment variable¶
ukpyn looks for your API key in your environment variables, under a variable named UKPN_API_KEY.
You will need to create this environment variable for ukpyn to find. There are a few options.
Using a .env file (recommended):
Create a new file called .env in the current working directory of the notebook session; this is normally the project root where you opened your IDE.
We use dotenv, which looks in the current directory and then in parent directories for the .env file.
If you are still unsure where that is, you can print it by running the following from your terminal:
python
import os; print(os.getcwd())
# a directory path will print to the screen, e.g. C:\projects\ukpyn
quit()
Now that you have made your .env file, open it with an editor such as Notepad or Notepad++ and add the following, replacing your_api_key_here with the key generated in the previous step:
UKPN_API_KEY=your_api_key_here
Alternatively, set the variable directly in your terminal session.
Linux/macOS:
export UKPN_API_KEY="your-api-key-here"
Windows (Command Prompt):
set UKPN_API_KEY=your-api-key-here
Windows (PowerShell):
$env:UKPN_API_KEY = "your-api-key-here"
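Whichever option you use, you can confirm that Python can actually see the variable. This is plain standard-library code, independent of ukpyn:

```python
import os

# os.environ.get returns None (rather than raising) if the variable is unset
api_key = os.environ.get("UKPN_API_KEY")

if api_key:
    # Never print the full key; show a masked preview instead
    print(f"UKPN_API_KEY is set ({api_key[:4]}..., {len(api_key)} characters)")
else:
    print("UKPN_API_KEY is not set - check your .env file or shell export")
```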
import ukpyn
ukpyn.check_api_key()
print("API key configured!")
3. Creating a Client and Listing Datasets¶
A client is a Python object that sends requests to the API and returns results.
In this notebook, UKPNClient uses async methods, so API calls use await.
What does async mean here?
- Async methods are useful for network calls, where Python waits for the server response without blocking everything else.
- await means: “pause this function here, let other tasks run, then resume when the result is ready.”
- When using the UKPNClient, you will regularly see the keyword await immediately before a call, which just tells your program to relax while the client speaks to the Open Data Portal.
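You can see the same pattern with a toy coroutine built on the standard library's asyncio, with no network involved; the names here are illustrative and not part of ukpyn:

```python
import asyncio

async def fetch_pretend_data() -> str:
    # Simulate a slow network call: while we "wait", the event loop
    # is free to run other tasks
    await asyncio.sleep(0.1)
    return "records"

async def main() -> None:
    # 'await' pauses main() here until fetch_pretend_data() finishes
    result = await fetch_pretend_data()
    print(f"Got: {result}")

# In a plain script you start the event loop yourself; in a Jupyter
# notebook a loop is already running, so you would write `await main()`
asyncio.run(main())
```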
Let's create the client:
from ukpyn import UKPNClient
# Create a client instance
# The API key is automatically loaded from UKPN_API_KEY environment variable
client = UKPNClient()
# Or, pass the API key directly (not recommended for production code)
# client = UKPNClient(api_key="your-api-key-here")
# Display a summary of the API client
print(client.summary())
Listing Available Datasets¶
By this point, you have ukpyn installed, your API key configured, and have a client ready to go!
Now the fun begins.
Let's start by listing datasets so you can choose one to work with. The example below requests the first 10 datasets and prints their summaries.
# List the first 10 datasets
response = await client.list_datasets(limit=10)
print(f"Total datasets available: {response.total_count}")
print("Dataset summaries:\n")
for item in response.datasets:
dataset = item.dataset
print("-" * 60)
print(dataset.summary())
print()
# Expected output:
# Dataset(id='...', title='...', has_records=..., records=..., fields=...)
Searching for Datasets¶
Use where with the ODSQL search() function when you only know a keyword (for example, smart or flexibility).
This helps you quickly find likely dataset IDs before fetching records.
# Search for datasets containing "smart" in their metadata
response = await client.list_datasets(where='search(*, "smart")', limit=5)
print(f"Found {response.total_count} datasets matching 'smart'\n")
for item in response.datasets:
print(f"- {item.title} ({item.id})")
Getting Dataset Details¶
Once you have a dataset ID, request full details. This includes useful metadata and the list of available fields (columns).
# Get details for a specific dataset
# Replace with an actual dataset ID from the list above
dataset_id = "ukpn-smart-meter-installation-volumes"
dataset = await client.get_dataset(dataset_id)
# Detailed, human-readable breakdown (renders rich HTML in notebooks)
dataset.details()
4. Fetching Records from a Dataset¶
After choosing a dataset, you can fetch rows (called records) from it.
Start with a small limit so you can inspect the structure safely.
# Fetch 5 records from the dataset
dataset_id = "ukpn-smart-meter-installation-volumes"
records_response = await client.get_records(dataset_id, limit=5)
print(f"Total records in dataset: {records_response.total_count}")
print(f"Records returned: {len(records_response.records)}")
print("\n" + "=" * 60)
# Display each record's fields in a readable format
for i, record in enumerate(records_response.records, 1):
print(f"\nRecord {i} (ID: {record.id})")
print("-" * 40)
if record.fields:
for key, value in record.fields.items():
print(f" {key}: {value}")
Filtering Records¶
Use filters to return only the records you care about.
ukpyn supports ODSQL filters (OpenDataSoft Query Language) through query parameters like where and order_by.
# Example: Filter records with a WHERE clause
# Note: Adjust the field names and values based on your dataset
filtered_records = await client.get_records(
dataset_id, where='local_authority="Surrey" OR local_authority="Kent"'
)
print(f"\nFiltered results: {filtered_records.total_count} total records")
print(f"Showing first {len(filtered_records.records)} records\n")
for i, record in enumerate(filtered_records.records, 1):
print(f"Record {i} (ID: {record.id})")
print(record.fields)
print("-" * 40)
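Because filters are ordinary Python strings, you can assemble them programmatically before making the call. The build_where helper below is a hypothetical convenience, not part of ukpyn:

```python
def build_where(field: str, values: list[str]) -> str:
    """Combine several allowed values for one field into an ODSQL OR clause."""
    clauses = [f'{field}="{value}"' for value in values]
    return " OR ".join(clauses)

where = build_where("local_authority", ["Surrey", "Kent"])
print(where)  # local_authority="Surrey" OR local_authority="Kent"
```

The resulting string can be passed straight to the where parameter.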
Selecting Specific Fields¶
Use select to return only the columns you need.
This makes responses smaller and easier to read, especially for wide datasets.
# Select only specific fields to reduce response size
try:
# Adjust field names based on your dataset
records = await client.get_records(
dataset_id,
limit=5,
# select="field1, field2", # Uncomment and adjust
)
print(f"Retrieved {len(records.records)} records")
# Display as a simple table
for record in records.records:
if record.fields:
print(record.fields)
except Exception as e:
print(f"Error: {e}")
Pagination¶
Large datasets are usually returned in pages.
Use limit (page size) and offset (start position) to move through results page by page.
# Paginate through records
page_size = 10
page_number = 0 # 0-indexed
try:
records = await client.get_records(
dataset_id,
limit=page_size,
offset=page_number * page_size,
)
total_pages = (records.total_count + page_size - 1) // page_size
print(f"Page {page_number + 1} of {total_pages}")
print(
f"Showing records {page_number * page_size + 1} to {page_number * page_size + len(records.records)}"
)
print(f"Total records: {records.total_count}")
except Exception as e:
print(f"Error: {e}")
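The offset arithmetic is worth pulling out on its own. The helpers below are a standalone sketch (no API calls) of how limit, offset, and page numbers relate:

```python
def page_offset(page_number: int, page_size: int) -> int:
    """Offset of the first record on a 0-indexed page."""
    return page_number * page_size

def total_pages(total_count: int, page_size: int) -> int:
    """Pages needed to cover total_count records (ceiling division)."""
    return (total_count + page_size - 1) // page_size

# A dataset with 95 records, read 10 at a time:
print(total_pages(95, 10))  # 10
print(page_offset(0, 10))   # 0  -> records 1-10
print(page_offset(9, 10))   # 90 -> records 91-95 (a short final page)
```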
5. Exporting Data to Different Formats¶
You can export data in formats that fit different tools and workflows:
- json: good for APIs and nested structures
- csv: simple tabular text, works in many tools
- xlsx: Excel workbook format
- geojson: geospatial JSON format
- shapefile: common GIS exchange format
- kml: map format used by tools like Google Earth
from ukpyn import EXPORT_FORMATS
print("Available export formats:")
for fmt in EXPORT_FORMATS:
print(f" - {fmt}")
Export to CSV¶
CSV is a good default when you want to inspect or share table-like data quickly.
# Export dataset to CSV
try:
from pathlib import Path
csv_data = await client.export_data(
dataset_id,
format="csv",
limit=100, # Limit to 100 records for this example
)
save_dir = None # Set to a directory (e.g. "exports") to enable writing files.
if save_dir:
output_file = Path(save_dir) / "export.csv"
output_file.parent.mkdir(parents=True, exist_ok=True)
with open(output_file, "wb") as f:
f.write(csv_data)
print(f"Exported {len(csv_data)} bytes to {output_file}")
else:
print(
f"Exported {len(csv_data)} bytes (file save skipped; set save_dir to enable writing)."
)
# Preview first few lines
print("\nPreview (first 500 characters):")
print(csv_data.decode("utf-8")[:500])
except Exception as e:
print(f"Export error: {e}")
Export to JSON¶
JSON is useful when you want structured data for scripts, apps, or APIs.
import json
# Export dataset to JSON
try:
json_data = await client.export_data(
dataset_id,
format="json",
limit=10,
)
# Parse and pretty-print
data = json.loads(json_data)
print(json.dumps(data[:2], indent=2)) # Show first 2 records
except Exception as e:
print(f"Export error: {e}")
Export to Excel¶
Use Excel export when your audience prefers spreadsheet files.
# Export dataset to Excel
try:
from pathlib import Path
xlsx_data = await client.export_data(
dataset_id,
format="xlsx",
limit=100,
)
save_dir = None # Set to a directory (e.g. "exports") to enable writing files.
if save_dir:
output_file = Path(save_dir) / "export.xlsx"
output_file.parent.mkdir(parents=True, exist_ok=True)
with open(output_file, "wb") as f:
f.write(xlsx_data)
print(f"Exported {len(xlsx_data)} bytes to {output_file}")
print("Open the file in Excel or use pandas to read it.")
else:
print(
f"Exported {len(xlsx_data)} bytes (file save skipped; set save_dir to enable writing)."
)
except Exception as e:
print(f"Export error: {e}")
Working with pandas (optional)¶
If you use pandas, you can load exports into a DataFrame for analysis and charting.
# Optional: Use pandas for data analysis
# Make sure pandas is installed: pip install pandas
try:
from io import BytesIO
import pandas as pd
# Export to CSV and load into pandas
csv_data = await client.export_data(
dataset_id,
format="csv",
limit=100,
)
# Create DataFrame (OpenDataSoft CSV exports typically use ';' as the separator)
df = pd.read_csv(BytesIO(csv_data), sep=";")
print(f"DataFrame shape: {df.shape}")
print(f"\nColumns: {list(df.columns)}")
print("\nFirst 5 rows:")
display(df.head())
except ImportError:
print("pandas is not installed. Install it with: pip install pandas")
except Exception as e:
print(f"Error: {e}")
Cleaning Up¶
Close the client when you are finished to release network resources cleanly.
# Close the client when done
await client.close()
print("Client closed.")
Using Context Managers (Recommended)¶
async with is the safest pattern because the client is closed automatically, even if an error happens.
# Recommended: Use async context manager for automatic cleanup
async with UKPNClient() as client:
datasets = await client.list_datasets(limit=3)
print(f"Found {datasets.total_count} datasets")
# Client is automatically closed when exiting the 'async with' block
print("Client automatically closed!")
Error Handling¶
ukpyn raises specific exception types so you can show clearer messages and recover gracefully.
from ukpyn import (
AuthenticationError,
NotFoundError,
RateLimitError,
ServerError,
UKPNError,
ValidationError,
)
async with UKPNClient() as client:
try:
# Try to access a non-existent dataset
dataset = await client.get_dataset("this-dataset-does-not-exist")
except NotFoundError as e:
print(f"Dataset not found: {e}")
except AuthenticationError as e:
print(f"Authentication failed: {e}")
print("Tip: Check that your UKPN_API_KEY is correct.")
except RateLimitError as e:
print(f"Rate limit exceeded: {e}")
if e.retry_after:
print(f"Try again in {e.retry_after} seconds.")
except ValidationError as e:
print(f"Invalid request: {e}")
except ServerError as e:
print(f"Server error: {e}")
print("Tip: The UKPN API might be experiencing issues. Try again later.")
except UKPNError as e:
# Catch-all for other API errors
print(f"API error: {e}")
Summary¶
You now know how to:
- Install ukpyn
- Configure UKPN_API_KEY
- Create a UKPNClient and list datasets
- Search datasets and inspect their fields
- Fetch records with filtering, field selection, and pagination
- Export data in common formats
- Handle common API errors
Next Steps¶
- Browse datasets on the UK Power Networks Open Data Portal
- Learn basic ODSQL filtering
- Continue with the next ukpyn tutorials for domain-specific examples
You are ready to start exploring real UK Power Networks open data.