Metrics
neureptrace.metrics
brier_score_multiclass(probabilities, labels)
Compute multiclass Brier score using one-hot targets.
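As an illustration of the one-hot formulation, here is a minimal NumPy sketch. It is not the library's implementation: the function name `brier_multiclass_sketch` is hypothetical, and it assumes the score is the squared error summed over classes and averaged over samples (some definitions additionally divide by the number of classes).

```python
import numpy as np

def brier_multiclass_sketch(probabilities, labels):
    """Mean over samples of the squared distance to the one-hot target."""
    probs = np.asarray(probabilities, dtype=float)
    y = np.asarray(labels, dtype=int)
    # Build one-hot targets with the same shape as the probability matrix.
    one_hot = np.zeros_like(probs)
    one_hot[np.arange(len(y)), y] = 1.0
    # Assumed normalization: sum over classes, then average over samples.
    return float(np.mean(np.sum((probs - one_hot) ** 2, axis=1)))
```

Under this convention a perfect prediction scores 0 and a uniform two-class prediction scores 0.5.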
Source code in src/neureptrace/metrics/__init__.py
compare_prepost_windows(frame, metric_column, pre_window, post_window, time_column='time', group_columns=())
Compare a metric between inclusive pre and post time windows.
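The idea of an inclusive window comparison can be sketched with pandas. Everything below is an assumption made for illustration: the helper name `window_mean_sketch`, the `(start, stop)` tuple form of a window, the `accuracy` column, and the use of the mean as the summary statistic. The statistics the library actually returns are not described here.

```python
import pandas as pd

def window_mean_sketch(frame, metric_column, window, time_column="time"):
    """Mean of a metric over rows whose time falls inside an inclusive window."""
    start, stop = window  # assumed window format: (start, stop)
    # Series.between is inclusive on both ends by default.
    mask = frame[time_column].between(start, stop)
    return float(frame.loc[mask, metric_column].mean())

# Hypothetical time-resolved metric table.
frame = pd.DataFrame(
    {
        "time": [-0.2, -0.1, 0.0, 0.1, 0.2],
        "accuracy": [0.5, 0.5, 0.6, 0.8, 0.9],
    }
)
pre = window_mean_sketch(frame, "accuracy", (-0.2, 0.0))
post = window_mean_sketch(frame, "accuracy", (0.1, 0.2))
delta = post - pre  # positive when the metric is higher after the event
```

Because both bounds are inclusive, the sample at `time == 0.0` belongs to the pre window in this example.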
Source code in src/neureptrace/metrics/prepost.py
confusion_category_enrichment(frame, *, metadata_frame, true_column='true_label', predicted_column='predicted_label', category_columns=None, group_columns=(), participant_column=None, metadata_label_columns=DEFAULT_METADATA_LABEL_COLUMNS, n_permutations=10000, seed=0)
Test whether off-diagonal errors stay within label metadata categories.
Source code in src/neureptrace/metrics/confusion.py
confusion_category_matrix(frame, *, metadata_frame, true_column='true_label', predicted_column='predicted_label', category_columns=None, group_columns=(), participant_column=None, metadata_label_columns=DEFAULT_METADATA_LABEL_COLUMNS)
Summarize directional category-to-category error counts and lifts.
Source code in src/neureptrace/metrics/confusion.py
confusion_counts(frame, true_column='true_label', predicted_column='predicted_label', group_columns=())
Count true/predicted label pairs in a trial-level prediction table.
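Counting label pairs in a trial-level table amounts to a group-by on the two label columns. The sketch below uses plain pandas rather than the library function; the example data and the output column name `count` are hypothetical.

```python
import pandas as pd

# Hypothetical trial-level predictions, one row per trial.
trials = pd.DataFrame(
    {
        "true_label": ["cat", "cat", "dog", "dog", "dog"],
        "predicted_label": ["cat", "dog", "dog", "dog", "cat"],
    }
)

# One row per observed (true, predicted) pair with its trial count.
counts = (
    trials.groupby(["true_label", "predicted_label"])
    .size()
    .reset_index(name="count")
)
```

Pairs that never occur are simply absent from the result, so a dense confusion matrix would need an extra pivot or reindex step.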
Source code in src/neureptrace/metrics/confusion.py
confusion_pair_summary(frame, true_column='true_label', predicted_column='predicted_label', group_columns=(), participant_column=None, metadata_frame=None, metadata_label_columns=DEFAULT_METADATA_LABEL_COLUMNS, label_prefix='label')
Summarize off-diagonal errors as unordered, bidirectional label pairs.
Expected counts preserve the true-label and predicted-label error marginals.
Metadata columns, when supplied, are copied for both labels and get an
additional same_<metadata_column> flag when both sides are known.
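To make "unordered, bidirectional" concrete, the following standard-library sketch folds the errors A→B and B→A into a single pair. The helper name is hypothetical, and the expected counts and metadata flags described above are deliberately left out.

```python
from collections import Counter

def unordered_error_pairs(true_labels, predicted_labels):
    """Count off-diagonal errors, treating (a, b) and (b, a) as the same pair."""
    pairs = Counter()
    for true, predicted in zip(true_labels, predicted_labels):
        if true != predicted:  # skip correct trials (the diagonal)
            # Sorting the two labels removes the direction of the error.
            pairs[tuple(sorted((true, predicted)))] += 1
    return pairs

pair_counts = unordered_error_pairs(
    ["a", "b", "a", "c"],
    ["b", "a", "a", "a"],
)
```

Here the errors a→b and b→a are pooled into one `("a", "b")` entry with a count of 2.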
Source code in src/neureptrace/metrics/confusion.py
expected_calibration_error(probabilities, labels, *, n_bins=10)
Compute top-label expected calibration error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| probabilities | ndarray | Array of shape | required |
| labels | ndarray | Integer class labels of shape | required |
| n_bins | int | Number of equally spaced confidence bins. | 10 |
Source code in src/neureptrace/metrics/__init__.py
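For orientation, top-label expected calibration error can be sketched in NumPy as follows. This is an independent illustration, not the code from the file referenced above: the name `ece_sketch` is hypothetical, and the bin edge convention (left-closed bins, with confidence 1.0 assigned to the last bin) is an assumption.

```python
import numpy as np

def ece_sketch(probabilities, labels, n_bins=10):
    """Occupancy-weighted mean of |accuracy - confidence| over confidence bins."""
    probs = np.asarray(probabilities, dtype=float)
    y = np.asarray(labels, dtype=int)
    confidence = probs.max(axis=1)  # top-label probability
    correct = (probs.argmax(axis=1) == y).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges give bin indices 0 .. n_bins - 1; 1.0 lands in the last bin.
    bin_index = np.digitize(confidence, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        in_bin = bin_index == b
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidence[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the fraction of samples in the bin
    return float(ece)
```

Two samples predicted with confidence 0.9, only one of them correct, give a gap of 0.4 in a single bin and therefore an ECE of 0.4.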
negative_log_likelihood(probabilities, labels, *, eps=1e-15)
Compute mean categorical negative log-likelihood from probabilities.
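A minimal NumPy sketch of the computation is shown below. It assumes that `eps` is used to clip the true-class probability away from zero before taking the logarithm; the source does not state how `eps` is applied, and the name `nll_sketch` is hypothetical.

```python
import numpy as np

def nll_sketch(probabilities, labels, eps=1e-15):
    """Mean of -log p(true class), with probabilities clipped to [eps, 1]."""
    probs = np.asarray(probabilities, dtype=float)
    y = np.asarray(labels, dtype=int)
    # Probability assigned to the true class of each sample.
    true_class_prob = probs[np.arange(len(y)), y]
    # Clipping (an assumption) keeps the log finite when a probability is 0.
    return float(-np.mean(np.log(np.clip(true_class_prob, eps, 1.0))))
```

With clipping, a sample whose true class received probability 0 contributes a large but finite penalty instead of infinity.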
Source code in src/neureptrace/metrics/__init__.py
per_class_accuracy(frame, true_column='true_label', predicted_column='predicted_label', participant_column=None, group_columns=())
Summarize one-vs-rest recall/accuracy for each true class.
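Per-class recall is the fraction of trials of each true class that were predicted correctly. The pandas sketch below illustrates that quantity only; the example data and the `correct` column name are hypothetical, and the participant and group handling of the library function is not reproduced.

```python
import pandas as pd

# Hypothetical trial-level predictions, one row per trial.
trials = pd.DataFrame(
    {
        "true_label": ["cat", "cat", "dog", "dog", "dog"],
        "predicted_label": ["cat", "dog", "dog", "dog", "cat"],
    }
)

# Mark correct trials, then average the flag within each true class.
recall = (
    trials.assign(correct=trials["true_label"] == trials["predicted_label"])
    .groupby("true_label")["correct"]
    .mean()
)
```

In this example `cat` is recovered on 1 of 2 trials and `dog` on 2 of 3.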
Source code in src/neureptrace/metrics/confusion.py
rank_class_scores(scores, classes, y_true, *, top_k=(2, 3), row_top_k=3, class_column='class')
Rank true labels in a per-class score matrix and compute top-k metrics.
Missing true labels are counted as top-k failures but are excluded from the
finite mean/median rank. If no class-score columns are available, top-k and
rank summaries are undefined and returned as NaN.
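The treatment of missing true labels can be sketched as follows. The helper names are hypothetical, and tie handling is an assumption: tied scores receive the best available rank here, which may differ from the library's behavior.

```python
import numpy as np

def true_label_ranks(scores, classes, y_true):
    """Rank of each true label within its score row (1 = best); NaN if missing."""
    scores = np.asarray(scores, dtype=float)
    column_of = {name: index for index, name in enumerate(classes)}
    ranks = np.full(len(y_true), np.nan)
    for row, label in enumerate(y_true):
        column = column_of.get(label)
        if column is not None:
            # One plus the number of classes that strictly outscore the true class.
            ranks[row] = 1 + np.sum(scores[row] > scores[row, column])
    return ranks

def top_k_rate(ranks, k):
    """Fraction of rows ranked within the top k; missing labels count as failures."""
    return float(np.mean(np.nan_to_num(ranks, nan=np.inf) <= k))

scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
ranks = true_label_ranks(scores, ["a", "b", "c"], ["b", "c", "z"])
mean_rank = float(np.nanmean(ranks))  # the missing label "z" is excluded
```

The third row has a true label with no score column, so it lowers the top-k rate but does not enter the mean rank.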
Source code in src/neureptrace/metrics/ranking.py
reliability_bins(probabilities, labels, *, n_bins=10)
Summarize top-label reliability bins for calibration plots.
Source code in src/neureptrace/metrics/__init__.py
summarize_window_metric(frame, metric_column, window, time_column='time', group_columns=())
Summarize one metric inside an inclusive time window.
Source code in src/neureptrace/metrics/prepost.py
top_k_accuracy(probabilities, labels, *, k=1)
Compute top-k classification accuracy from probability rows.
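One way to compute this quantity with NumPy is sketched below. The function name is hypothetical, and the sketch assumes `1 <= k <= n_classes` with arbitrary tie-breaking among equal probabilities.

```python
import numpy as np

def top_k_accuracy_sketch(probabilities, labels, k=1):
    """Fraction of rows whose true label is among the k most probable classes."""
    probs = np.asarray(probabilities, dtype=float)
    y = np.asarray(labels, dtype=int)
    # argpartition places the k largest probabilities in the first k columns
    # (their order within those k columns is not guaranteed).
    top_k = np.argpartition(-probs, k - 1, axis=1)[:, :k]
    return float(np.mean((top_k == y[:, None]).any(axis=1)))

probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7]]
top1 = top_k_accuracy_sketch(probs, [0, 2, 1], k=1)
top2 = top_k_accuracy_sketch(probs, [0, 2, 1], k=2)
```

Only the first row is correct at `k=1`, while every true label falls within the top two classes.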
Source code in src/neureptrace/metrics/__init__.py
validate_probability_inputs(probabilities, labels=None, *, require_normalized=True, normalization_atol=1e-06)
Validate and coerce probability-matrix inputs used by scoring metrics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| probabilities | ndarray | Array-like object with shape | required |
| labels | ndarray \| None | Optional integer class labels of shape | None |
| require_normalized | bool | If true, each probability row must sum to one within | True |
| normalization_atol | float | Absolute tolerance for row-sum checks. | 1e-06 |
Source code in src/neureptrace/metrics/__init__.py
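The kind of checks such a validator might perform can be sketched as follows. This is an independent illustration, not the code from the file referenced above. The specific checks, the error messages, the name `check_probabilities_sketch`, and the use of `normalization_atol` as the row-sum tolerance are all assumptions.

```python
import numpy as np

def check_probabilities_sketch(
    probabilities, labels=None, require_normalized=True, normalization_atol=1e-6
):
    """Coerce inputs to arrays and raise ValueError on malformed values."""
    probs = np.asarray(probabilities, dtype=float)
    if probs.ndim != 2 or probs.size == 0:
        raise ValueError("probabilities must be a non-empty 2-D array")
    if not np.all(np.isfinite(probs)) or probs.min() < 0.0 or probs.max() > 1.0:
        raise ValueError("probabilities must be finite values in [0, 1]")
    if require_normalized:
        row_sums = probs.sum(axis=1)
        # Absolute tolerance only, so the check does not scale with the row sum.
        if not np.allclose(row_sums, 1.0, rtol=0.0, atol=normalization_atol):
            raise ValueError("each probability row must sum to one")
    if labels is not None:
        labels = np.asarray(labels)
        if labels.shape != (probs.shape[0],):
            raise ValueError("labels must have one entry per probability row")
        if not np.issubdtype(labels.dtype, np.integer):
            raise ValueError("labels must be integers")
        if labels.min() < 0 or labels.max() >= probs.shape[1]:
            raise ValueError("labels must index a probability column")
    return probs, labels
```

Validating once at the boundary lets the scoring functions index `probs[row, label]` without repeating these guards.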
validate_sample_weight(sample_weight, n_samples)
Return validated non-negative per-sample weights.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sample_weight | Iterable[float] \| ndarray | One-dimensional non-negative weights. | required |
| n_samples | int | Expected number of samples. | required |
Source code in src/neureptrace/metrics/weighted.py
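A weight check along these lines could look like the sketch below. It is an independent illustration, not the code from the file referenced above; the function name, the finiteness check, and the error messages are assumptions.

```python
import numpy as np

def check_sample_weight_sketch(sample_weight, n_samples):
    """Return weights as a float array after shape and sign checks."""
    weights = np.asarray(list(sample_weight), dtype=float)
    if weights.ndim != 1 or weights.shape[0] != n_samples:
        raise ValueError("sample_weight must be 1-D with one weight per sample")
    if not np.all(np.isfinite(weights)) or np.any(weights < 0.0):
        raise ValueError("sample_weight must be finite and non-negative")
    return weights
```

Zero weights pass this check, which lets callers mask out samples without changing array shapes.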
weighted_brier_score_multiclass(probabilities, labels, sample_weight)
Compute a weighted multiclass Brier score using one-hot targets.
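Weighting replaces the plain mean over samples with a weighted mean. The NumPy sketch below illustrates this under the assumption that the per-sample score is the squared error summed over classes and that weights are normalized by their sum; the function name is hypothetical.

```python
import numpy as np

def weighted_brier_sketch(probabilities, labels, sample_weight):
    """Weighted mean of per-sample squared distance to the one-hot target."""
    probs = np.asarray(probabilities, dtype=float)
    y = np.asarray(labels, dtype=int)
    weights = np.asarray(sample_weight, dtype=float)
    one_hot = np.zeros_like(probs)
    one_hot[np.arange(len(y)), y] = 1.0
    per_sample = np.sum((probs - one_hot) ** 2, axis=1)
    # np.average divides by the sum of the weights.
    return float(np.average(per_sample, weights=weights))
```

With equal weights this reduces to the unweighted mean, and the same pattern carries over to the other weighted metrics.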
Source code in src/neureptrace/metrics/weighted.py
weighted_expected_calibration_error(probabilities, labels, sample_weight, *, n_bins=10)
Compute weighted top-label expected calibration error.
Source code in src/neureptrace/metrics/weighted.py
weighted_negative_log_likelihood(probabilities, labels, sample_weight, *, eps=1e-15)
Compute weighted mean categorical negative log-likelihood.
Source code in src/neureptrace/metrics/weighted.py
weighted_reliability_bins(probabilities, labels, sample_weight, *, n_bins=10)
Summarize weighted top-label reliability bins for calibration plots.
The returned rows keep the unweighted reliability_bins schema and add
sample_weight plus sample_weight_fraction so downstream reports can
display both raw-bin occupancy and the effective contribution of each bin.
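The difference between raw occupancy and effective contribution can be illustrated with a small NumPy sketch. Only `sample_weight` and `sample_weight_fraction` are named in the text above; the other keys (`bin`, `count`, `accuracy`, `confidence`), the function name, and the bin edge convention are hypothetical.

```python
import numpy as np

def weighted_bins_sketch(probabilities, labels, sample_weight, n_bins=10):
    """One dict per non-empty confidence bin with raw and weighted occupancy."""
    probs = np.asarray(probabilities, dtype=float)
    y = np.asarray(labels, dtype=int)
    weights = np.asarray(sample_weight, dtype=float)
    confidence = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == y).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_index = np.digitize(confidence, edges[1:-1])
    total_weight = float(weights.sum())
    rows = []
    for b in range(n_bins):
        in_bin = bin_index == b
        bin_weight = float(weights[in_bin].sum())
        if bin_weight == 0.0:
            continue  # skip empty bins and bins holding only zero-weight samples
        rows.append(
            {
                "bin": b,
                "count": int(in_bin.sum()),  # raw occupancy
                "sample_weight": bin_weight,  # effective occupancy
                "sample_weight_fraction": bin_weight / total_weight,
                "accuracy": float(np.average(correct[in_bin], weights=weights[in_bin])),
                "confidence": float(np.average(confidence[in_bin], weights=weights[in_bin])),
            }
        )
    return rows

rows = weighted_bins_sketch(
    [[0.95, 0.05], [0.85, 0.15], [0.55, 0.45]], [0, 1, 0], [1.0, 3.0, 4.0]
)
```

Each of the three bins here holds one sample, yet the lowest-confidence bin carries half of the total weight.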
Source code in src/neureptrace/metrics/weighted.py
weighted_top_k_accuracy(probabilities, labels, sample_weight, *, k=1)
Compute weighted top-k classification accuracy.
Source code in src/neureptrace/metrics/weighted.py
Pre/Post Windows
neureptrace.metrics.prepost
compare_prepost_windows(frame, metric_column, pre_window, post_window, time_column='time', group_columns=())
Compare a metric between inclusive pre and post time windows.
Source code in src/neureptrace/metrics/prepost.py
summarize_window_metric(frame, metric_column, window, time_column='time', group_columns=())
Summarize one metric inside an inclusive time window.
Source code in src/neureptrace/metrics/prepost.py
Confusion Tables
neureptrace.metrics.confusion
confusion_category_enrichment(frame, *, metadata_frame, true_column='true_label', predicted_column='predicted_label', category_columns=None, group_columns=(), participant_column=None, metadata_label_columns=DEFAULT_METADATA_LABEL_COLUMNS, n_permutations=10000, seed=0)
Test whether off-diagonal errors stay within label metadata categories.
Source code in src/neureptrace/metrics/confusion.py
confusion_category_matrix(frame, *, metadata_frame, true_column='true_label', predicted_column='predicted_label', category_columns=None, group_columns=(), participant_column=None, metadata_label_columns=DEFAULT_METADATA_LABEL_COLUMNS)
Summarize directional category-to-category error counts and lifts.
Source code in src/neureptrace/metrics/confusion.py
confusion_counts(frame, true_column='true_label', predicted_column='predicted_label', group_columns=())
Count true/predicted label pairs in a trial-level prediction table.
Source code in src/neureptrace/metrics/confusion.py
confusion_pair_summary(frame, true_column='true_label', predicted_column='predicted_label', group_columns=(), participant_column=None, metadata_frame=None, metadata_label_columns=DEFAULT_METADATA_LABEL_COLUMNS, label_prefix='label')
Summarize off-diagonal errors as unordered, bidirectional label pairs.
Expected counts preserve the true-label and predicted-label error marginals.
Metadata columns, when supplied, are copied for both labels and get an
additional same_<metadata_column> flag when both sides are known.
Source code in src/neureptrace/metrics/confusion.py
per_class_accuracy(frame, true_column='true_label', predicted_column='predicted_label', participant_column=None, group_columns=())
Summarize one-vs-rest recall/accuracy for each true class.
Source code in src/neureptrace/metrics/confusion.py