Adding Multiple Sorting Interfaces#

If a session has multiple probes, or the same recording has been sorted with different algorithms, you will need to handle more than one spike sorting output. This how-to shows how to add more than one sorting interface to an NWB file using NeuroConv.

Why Unit ID Management Matters#

When you have multiple sorting outputs (e.g., from different algorithms such as Kilosort and MountainSort), their unit IDs may overlap (0, 1, 2, etc.). If they do, adding them naively causes the rows from the second sorter to be skipped: NeuroConv treats units with matching IDs as the same unit and does not add them again, to avoid duplicates. There are two main approaches to this problem:

  1. Rename units to create unique identifiers before merging into the canonical Units table

  2. Keep separate tables for each sorter in the processing module to maintain original sorter IDs
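The skipping behaviour described above can be pictured with a toy model. This is a minimal sketch in plain Python, not NeuroConv's actual implementation: `add_units` is a hypothetical helper that stands in for "add rows unless a unit with that name is already in the table".

```python
def add_units(table, unit_names):
    """Append unit names to `table`, skipping any name already present.

    Toy stand-in for duplicate handling; returns the names that were skipped.
    """
    skipped = []
    for name in unit_names:
        if name in table:
            skipped.append(name)
        else:
            table.append(name)
    return skipped


units_table = []
add_units(units_table, ["0", "1", "2", "3"])            # first sorter
skipped = add_units(units_table, ["0", "1", "2", "3"])  # second sorter, same IDs

print(units_table)  # ['0', '1', '2', '3']
print(skipped)      # ['0', '1', '2', '3']
```

In this model the table still has four rows after the second call, which is the situation both approaches are meant to avoid.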

Setting Up the Example#

First, let’s create two mock sorting interfaces to demonstrate the concepts:

from neuroconv.tools.testing.mock_interfaces import MockSortingInterface
from neuroconv import ConverterPipe

# Create two sorting interfaces with overlapping unit IDs
sorting_interface1 = MockSortingInterface(num_units=4)
sorting_interface2 = MockSortingInterface(num_units=4)

print("Sorting 1 unit IDs:", sorting_interface1.units_ids)
print("Sorting 2 unit IDs:", sorting_interface2.units_ids)

Expected output:

Sorting 1 unit IDs: ['0', '1', '2', '3']
Sorting 2 unit IDs: ['0', '1', '2', '3']

The units from both sorting interfaces have the same IDs, which will cause conflicts.
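Before merging, it can help to check for overlap explicitly rather than by eye. The following is a small sketch in plain Python; `find_overlapping_ids` is a hypothetical helper, not part of NeuroConv, and the ID lists are written out by hand to keep the example self-contained.

```python
def find_overlapping_ids(ids_a, ids_b):
    """Return the unit IDs present in both lists, in the order of `ids_a`."""
    seen = set(ids_b)
    return [unit_id for unit_id in ids_a if unit_id in seen]


# Hand-written stand-ins for the two ID lists printed above
overlap = find_overlapping_ids(["0", "1", "2", "3"], ["0", "1", "2", "3"])
print(overlap)  # ['0', '1', '2', '3']
```

In practice the two lists would presumably come from the units_ids attribute of each interface, as printed in the setup code. An empty result means the sortings can be merged without renaming.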

Approach 1: Canonical Units Table with Unit Renaming#

This approach merges all sorting results into the main NWB Units table after renaming units to avoid ID conflicts.

Step 1: Add First Sorting to NWB File#

# Create NWB file with first sorting (no renaming needed)
nwbfile = sorting_interface1.create_nwbfile()
print("Units after adding first sorting:")
print(nwbfile.units.to_dataframe()[['unit_name']])

Expected output:

Units after adding first sorting:
id  unit_name
0           0
1           1
2           2
3           3

Step 2: Rename Units in Second Sorting#

Before adding the second sorting, we need to rename its units to avoid conflicts:

# Rename the units with rename_unit_ids, passing a dictionary that maps old IDs to new IDs
unit_rename_map = {
    '0': 'sorter2_unit_0',
    '1': 'sorter2_unit_1',
    '2': 'sorter2_unit_2',
    '3': 'sorter2_unit_3'
}
sorting_interface2.rename_unit_ids(unit_rename_map)

print("Sorting 2 unit IDs after renaming:", sorting_interface2.units_ids)

Expected output:

Sorting 2 unit IDs after renaming: ['sorter2_unit_0', 'sorter2_unit_1', 'sorter2_unit_2', 'sorter2_unit_3']
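With more than a handful of units, writing the mapping by hand becomes tedious. A minimal sketch of building the same dictionary programmatically is shown here; `build_rename_map` is a hypothetical helper name, and the prefix is simply the one used in the example above.

```python
def build_rename_map(unit_ids, prefix):
    """Map each existing unit ID to the same ID with `prefix` prepended."""
    return {unit_id: f"{prefix}{unit_id}" for unit_id in unit_ids}


unit_rename_map = build_rename_map(["0", "1", "2", "3"], prefix="sorter2_unit_")
print(unit_rename_map)
# {'0': 'sorter2_unit_0', '1': 'sorter2_unit_1', '2': 'sorter2_unit_2', '3': 'sorter2_unit_3'}
```

The resulting dictionary has the same shape as the hand-written mapping passed to rename_unit_ids.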

Step 3: Add Second Sorting to Existing NWB File#

# Add the renamed sorting to the existing NWB file
sorting_interface2.add_to_nwbfile(nwbfile=nwbfile)

print("Units after adding both sortings:")
units_df = nwbfile.units.to_dataframe()
print(units_df[['unit_name']])

Expected output:

Units after adding both sortings:
id       unit_name
0                0
1                1
2                2
3                3
4   sorter2_unit_0
5   sorter2_unit_1
6   sorter2_unit_2
7   sorter2_unit_3

Advantages of This Approach#

  • All units are in the canonical Units table, making analysis easier

  • Creates session-unique unit identifiers

  • Standard location that analysis tools expect

Disadvantages#

  • Requires careful unit ID management

  • Original sorter IDs are lost unless preserved in unit properties
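One way to address the last point is to keep the rename mapping and turn it into a per-unit property. The sketch below uses plain Python only; `original_id_property` is a hypothetical helper, and the property name "original_unit_id" mentioned afterwards is an assumption chosen for illustration.

```python
def original_id_property(rename_map):
    """Split a rename map into (renamed_ids, original_ids), aligned by position."""
    renamed_ids = list(rename_map.values())
    original_ids = list(rename_map.keys())
    return renamed_ids, original_ids


rename_map = {
    "0": "sorter2_unit_0",
    "1": "sorter2_unit_1",
    "2": "sorter2_unit_2",
    "3": "sorter2_unit_3",
}
renamed_ids, original_ids = original_id_property(rename_map)
print(renamed_ids)   # ['sorter2_unit_0', 'sorter2_unit_1', 'sorter2_unit_2', 'sorter2_unit_3']
print(original_ids)  # ['0', '1', '2', '3']
```

After renaming, the two lists could be passed to the sorting extractor's set_property method as ids=renamed_ids and values=original_ids, for example under the key "original_unit_id", so that the original IDs end up as a column in the Units table.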

Approach 2: Separate Tables in Processing Module#

This approach keeps each sorting in its own table within the processing module, preserving original unit IDs.

# Create fresh sorting interfaces
mock_sorting1 = MockSortingInterface(num_units=4)
mock_sorting2 = MockSortingInterface(num_units=4)

# Set up data interfaces with descriptive names
data_interfaces = {
    "kilosort_sorting": mock_sorting1,
    "mountainsort_sorting": mock_sorting2,
}

# Create converter with both sortings
converter = ConverterPipe(data_interfaces=data_interfaces)

# Configure to write each sorting to separate processing tables
conversion_options = {
    "kilosort_sorting": {
        "write_as": "processing",
        "units_name": "UnitsKilosort",
        "units_description": "Units detected by Kilosort spike sorting algorithm"
    },
    "mountainsort_sorting": {
        "write_as": "processing",
        "units_name": "UnitsMountainSort",
        "units_description": "Units detected by MountainSort spike sorting algorithm"
    },
}

# Create NWB file with separate tables
nwbfile = converter.create_nwbfile(conversion_options=conversion_options)

print("Processing module contents:")
print(list(nwbfile.processing['ecephys'].data_interfaces.keys()))
# ['UnitsKilosort', 'UnitsMountainSort']

Advantages of This Approach#

  • Preserves original unit IDs from each sorter

  • Clear provenance of which algorithm produced which units

  • No risk of ID conflicts

Disadvantages#

  • Analysis tools need to know which table to use

  • More complex to work with multiple tables

  • Units are not in the standard NWB Units location
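The conversion options used in this approach follow a regular pattern, so with many sorters they can be generated rather than typed out. This is a sketch under that assumption: `make_processing_options` is a hypothetical helper, and the option keys (write_as, units_name, units_description) are the ones shown in the example for this approach.

```python
def make_processing_options(sorter_labels):
    """Build per-interface conversion options that write each sorting
    to its own table in the processing module.

    `sorter_labels` maps an interface key to a human-readable sorter name.
    """
    options = {}
    for interface_key, label in sorter_labels.items():
        options[interface_key] = {
            "write_as": "processing",
            "units_name": f"Units{label}",
            "units_description": f"Units detected by {label} spike sorting algorithm",
        }
    return options


conversion_options = make_processing_options(
    {"kilosort_sorting": "Kilosort", "mountainsort_sorting": "MountainSort"}
)
print(conversion_options["kilosort_sorting"]["units_name"])      # UnitsKilosort
print(conversion_options["mountainsort_sorting"]["units_name"])  # UnitsMountainSort
```

The keys of `sorter_labels` need to match the keys of the data_interfaces dictionary given to the converter.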

Alternative Renaming Approaches#

You can also use more descriptive naming schemes:

# Descriptive naming based on sorting algorithm
descriptive_map = {
    '0': 'kilosort_cluster_01',
    '1': 'kilosort_cluster_02',
    '2': 'kilosort_cluster_03',
    '3': 'kilosort_cluster_04'
}

# Or cell-type based naming
celltype_map = {
    '0': 'pyramidal_neuron_1',
    '1': 'interneuron_1',
    '2': 'pyramidal_neuron_2',
    '3': 'unclassified_1'
}
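Whatever scheme you choose, the new names have to be unique and the keys have to match existing unit IDs. The sketch below generates the zero-padded scheme and runs a sanity check on a mapping. Both helpers are hypothetical and written in plain Python; the check reflects assumptions made here for illustration, not validation performed by NeuroConv.

```python
def zero_padded_map(unit_ids, prefix, width=2):
    """Map each unit ID to `prefix` plus a 1-based, zero-padded counter."""
    return {
        unit_id: f"{prefix}{index:0{width}d}"
        for index, unit_id in enumerate(unit_ids, start=1)
    }


def check_rename_map(unit_ids, rename_map):
    """Return a list of problems found in `rename_map` (empty if none)."""
    problems = []
    unknown = [old for old in rename_map if old not in unit_ids]
    if unknown:
        problems.append(f"keys that are not existing unit IDs: {unknown}")
    # Assumption: units missing from the map would keep their old ID.
    final_ids = [rename_map.get(unit_id, unit_id) for unit_id in unit_ids]
    duplicates = sorted({name for name in final_ids if final_ids.count(name) > 1})
    if duplicates:
        problems.append(f"names assigned to more than one unit: {duplicates}")
    return problems


unit_ids = ["0", "1", "2", "3"]
descriptive_map = zero_padded_map(unit_ids, prefix="kilosort_cluster_")
print(descriptive_map["0"])                          # kilosort_cluster_01
print(check_rename_map(unit_ids, descriptive_map))   # []

bad_map = {"0": "unit_a", "1": "unit_a", "7": "unit_b"}
for problem in check_rename_map(unit_ids, bad_map):
    print(problem)
```

Running a check like this before renaming catches typos in the mapping early, before any units are written to the file.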

Adding Custom Properties to the Units Table#

When using the canonical Units table approach, you may want to add columns that give context about your units. This is particularly useful when combining units from multiple probes or sorting algorithms. You can add custom properties with the sorting extractor’s set_property method. Note that any properties the sorting extractor has already loaded are added to the units table automatically.

Adding Probe Information#

Here’s how to add a “probe” column to distinguish units from different probes:

from neuroconv.tools.testing.mock_interfaces import MockSortingInterface

# Create two sorting interfaces representing different probes
probe1_sorting = MockSortingInterface(num_units=4)
probe2_sorting = MockSortingInterface(num_units=3)

# Rename units to avoid conflicts (do this first)
probe1_sorting.rename_unit_ids({
    '0': 'a',
    '1': 'b',
    '2': 'c',
    '3': 'd',
})

probe2_sorting.rename_unit_ids({
    '0': 'e',
    '1': 'f',
    '2': 'g',
})

# Add probe information as a property for each sorting
probe1_sorting.sorting_extractor.set_property(
    key="probe",
    values=["probe_A"] * 4,  # All 4 units are from probe A
    ids=["a", "b", "c", "d"]
)

probe2_sorting.sorting_extractor.set_property(
    key="probe",
    values=["probe_B"] * 3,  # All 3 units are from probe B
    ids=["e", "f", "g"]  # Use renamed IDs
)

# Create NWB file and add both sortings
nwbfile = probe1_sorting.create_nwbfile()
probe2_sorting.add_to_nwbfile(nwbfile=nwbfile)

# Verify the probe column was added
units_df = nwbfile.units.to_dataframe()
print("Units table with probe information:")
print(units_df[['unit_name', 'probe']])

Expected output:

Units table with probe information:
id  unit_name    probe
0          a  probe_A
1          b  probe_A
2          c  probe_A
3          d  probe_A
4          e  probe_B
5          f  probe_B
6          g  probe_B

Adding Algorithm Provenance Information#

You can also add information about which sorting algorithm was used:

# Create sorting interfaces for different algorithms
kilosort_sorting = MockSortingInterface(num_units=3)
mountainsort_sorting = MockSortingInterface(num_units=2)

# Rename units to avoid conflicts (do this first)
kilosort_sorting.rename_unit_ids({
    '0': 'a',
    '1': 'b',
    '2': 'c',
})

mountainsort_sorting.rename_unit_ids({
    '0': 'd',
    '1': 'e',
})

# Add algorithm information
kilosort_sorting.sorting_extractor.set_property(
    key="algorithm",
    values=["kilosort"] * 3,
    ids=["a", "b", "c"]
)

mountainsort_sorting.sorting_extractor.set_property(
    key="algorithm",
    values=["mountainsort"] * 2,
    ids=["d", "e"]  # Use renamed IDs
)

# You can add further properties the same way, one set_property call per property
kilosort_sorting.sorting_extractor.set_property(
    key="quality_score",
    values=[0.95, 0.87, 0.92],
    ids=["a", "b", "c"]
)

mountainsort_sorting.sorting_extractor.set_property(
    key="quality_score",
    values=[0.89, 0.76],
    ids=["d", "e"]  # Use renamed IDs
)

# Create NWB file with both sortings
nwbfile = kilosort_sorting.create_nwbfile()
mountainsort_sorting.add_to_nwbfile(nwbfile=nwbfile)

# View the enriched units table
units_df = nwbfile.units.to_dataframe()
print("Units table with algorithm and quality information:")
print(units_df[['unit_name', 'algorithm', 'quality_score']])

Expected output:

Units table with algorithm and quality information:
id  unit_name     algorithm  quality_score
0           a      kilosort           0.95
1           b      kilosort           0.87
2           c      kilosort           0.92
3           d  mountainsort           0.89
4           e  mountainsort           0.76
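In the calls above, each entry in values is paired with the entry at the same position in ids, so the two lists must stay in step. When per-unit values come from elsewhere (for example a dictionary keyed by unit ID), a small helper can build the values list in the right order. This is a plain-Python sketch; `aligned_property` is a hypothetical name, and the positional pairing is an assumption based on how the examples above are written.

```python
def aligned_property(unit_ids, per_unit_values):
    """Return values ordered to match `unit_ids`.

    Raises KeyError if any unit has no value, rather than silently misaligning.
    """
    missing = [unit_id for unit_id in unit_ids if unit_id not in per_unit_values]
    if missing:
        raise KeyError(f"no value provided for unit IDs: {missing}")
    return [per_unit_values[unit_id] for unit_id in unit_ids]


# Values keyed by (renamed) unit ID, in arbitrary order
quality_by_unit = {"c": 0.92, "a": 0.95, "b": 0.87}
ids = ["a", "b", "c"]
values = aligned_property(ids, quality_by_unit)
print(values)  # [0.95, 0.87, 0.92]
```

The resulting ids and values lists can then be passed to set_property in the same way as the hand-written lists.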