RESFL: An Uncertainty-Aware Framework for Responsible Federated Learning by Balancing Privacy, Fairness and Utility

Dawood Wasif¹, Terrence J. Moore², Jin-Hee Cho¹
¹Virginia Tech, ²U.S. Army Research Laboratory
ICLR 2026

(TL;DR) We propose RESFL, a federated learning framework that jointly optimizes privacy, group fairness, and detection utility via adversarial representation disentanglement and evidential uncertainty-guided aggregation — reducing membership-inference attack success by 37% and the equality-of-opportunity gap by 17% vs. FedAvg, while maintaining competitive mAP on FACET and CARLA.

Abstract

Federated Learning (FL) has gained prominence in machine learning across critical domains, enabling collaborative model training without centralized data aggregation. However, FL frameworks that protect privacy often sacrifice fairness: differential privacy reduces data leakage but hides sensitive attributes needed for bias correction, worsening performance gaps across demographic groups. This work explores the trade-off between privacy and fairness in FL-based object detection and introduces RESFL, an integrated solution jointly optimizing both.

RESFL incorporates adversarial privacy disentanglement and uncertainty-guided fairness-aware aggregation. The adversarial component uses a gradient reversal layer to remove sensitive attributes from learned representations. The uncertainty-aware aggregation employs an evidential neural network to weight client updates adaptively, prioritizing contributions with lower fairness disparities and higher confidence. We validate on high-stakes autonomous vehicle scenarios (FACET, CARLA) and non-visual benchmarks (Adult, TweetEval), confirming RESFL as a domain-agnostic foundation for responsible federated optimization.

37% MIA Reduction
17% Fairness Gap ↓
0.665 mAP on FACET
>90% Label-Free Retention

Method

Overview

RESFL addresses two core challenges simultaneously: (i) preventing sensitive-attribute leakage during training, and (ii) mitigating bias in client updates. The framework integrates an adversarial privacy module and an evidential uncertainty head, both operating locally, with a fairness-aware server aggregation rule.

1. Feature Extraction: YOLOv8 backbone computes \(h_i = f(x;\,\theta_i)\) per client.
2. Gradient Reversal: GRL drives \(I(H;S) \!\to\! 0\), pushing attribute inference toward chance level.
3. UFM Computation: Dirichlet \(\alpha_0\) per group yields scale-invariant disparity score \(\mathrm{UFM}_i\).
4. Fair Aggregation: Server: \(\omega_i \propto \exp(-\beta\,\mathrm{UFM}_i)\) weights low-disparity clients higher.

Figure 1. Client-side local training pipeline. The feature extractor \(h_i = f(x;\theta_i)\) feeds both an Evidential Head \(E_\phi\) (for UFM computation) and a Detection Head \(D_\phi\), plus a Gradient Reversal Layer that drives the adversarial privacy loss \(\mathcal{L}_\mathrm{adv}\). The composite local objective is \(\mathcal{L}_\mathrm{local} = \mathcal{L}_\mathrm{task} + \lambda_\mathrm{priv}\,\mathcal{L}_\mathrm{adv} + \lambda_\mathrm{fair}\,\mathcal{L}_\mathrm{uncertainty}\).

Uncertainty Fairness Metric (UFM)

Each client replaces its softmax detection head with an evidential output layer that predicts a non-negative concentration vector \(\boldsymbol{\alpha} = (\alpha_1,\ldots,\alpha_C)\) parameterizing a Dirichlet over class probabilities. The total evidence \(\alpha_0 = \sum_c \alpha_c\) gives a closed-form epistemic variance:

\[\sigma^2_{\mathrm{epi},c} \;=\; \frac{\alpha_c}{\alpha_0}\!\left(1 - \frac{\alpha_c}{\alpha_0}\right)\!\cdot\!\frac{1}{\alpha_0+1} \;\;\sim\;\; \frac{1}{\alpha_0}\]
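As a quick numerical check of the variance formula, the sketch below maps raw logits to concentrations with \(\alpha_c = 1 + \mathrm{softplus}(z_c)\), the mapping this section uses for the evidential head, and evaluates the per-class variance. The function names and the example logits are illustrative assumptions, not taken from the released code.

```python
import math

def softplus(z: float) -> float:
    """Numerically stable softplus: log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def dirichlet_stats(logits):
    """Return (alpha, alpha0, per-class variance) for one prediction.

    Hypothetical helper: alpha_c = 1 + softplus(z_c), alpha0 = sum_c alpha_c,
    var_c = (alpha_c/alpha0) * (1 - alpha_c/alpha0) / (alpha0 + 1).
    """
    alpha = [1.0 + softplus(z) for z in logits]
    alpha0 = sum(alpha)
    var = [(a / alpha0) * (1.0 - a / alpha0) / (alpha0 + 1.0) for a in alpha]
    return alpha, alpha0, var

# Same predicted mean (uniform over 3 classes), different amounts of evidence.
weak = dirichlet_stats([0.0, 0.0, 0.0])    # made-up logits, little evidence
strong = dirichlet_stats([8.0, 8.0, 8.0])  # made-up logits, more evidence

print("alpha0:", weak[1], strong[1])
print("variance of class 0:", weak[2][0], strong[2][0])
```

Both predictions have the same mean, but the one with larger total evidence has the smaller variance, which is the \(1/\alpha_0\) behaviour the formula indicates.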

Higher \(\alpha_0\) means lower uncertainty. Raw logits are passed through \(\alpha_c = 1 + \mathrm{softplus}(z_c)\) to ensure strict positivity. For group \(g\), the group-wise mean evidence aggregated over post-NMS detections \(\mathcal{P}_\tau(x)\) at threshold \(\tau\) is:

\[\bar{\alpha}_{0,g} \;=\; \mathbb{E}_{x \in \mathcal{D}_g}\!\left[\frac{1}{\max\!\left(1,\,|\mathcal{P}_\tau(x)|\right)}\sum_{d \,\in\, \mathcal{P}_\tau(x)} \alpha_0^{(d)}\right]\]

The Uncertainty Fairness Metric is the normalized inter-group epistemic gap:

\[\Delta_u = \max_g \frac{1}{\bar{\alpha}_{0,g}} - \min_g \frac{1}{\bar{\alpha}_{0,g}}, \qquad \mathrm{UFM} = \frac{\Delta_u}{\dfrac{1}{G}\displaystyle\sum_{g=1}^{G} \dfrac{1}{\bar{\alpha}_{0,g}} + \varepsilon}\]

UFM equals zero under perfect group parity and grows monotonically with disparity. It is scale-invariant and provably bounds the downstream fairness gaps \(|1-\mathrm{DI}|\) and \(\Delta\mathrm{EOP}\) (Appendix B, Thm. B.1 → Cor. B.5).
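The two formulas above can be sketched in a few lines. This is a minimal illustration that assumes each group is given as a list of images, each image being a list of \(\alpha_0\) values for its post-NMS detections, and that every group has at least one detection. The helper names and the numbers are hypothetical.

```python
def mean_evidence(per_image_alpha0):
    """Group-wise mean evidence.

    For each image, sum alpha_0 over its detections and divide by
    max(1, number of detections); then average over the group's images.
    """
    per_image = [sum(dets) / max(1, len(dets)) for dets in per_image_alpha0]
    return sum(per_image) / len(per_image)

def ufm(group_evidence, eps=1e-8):
    """Normalized gap between the largest and smallest 1/alpha_bar across groups."""
    inv = [1.0 / a for a in group_evidence]       # uncertainty proxy per group
    delta_u = max(inv) - min(inv)                 # inter-group gap
    return delta_u / (sum(inv) / len(inv) + eps)  # normalize by the mean proxy

# Illustrative, made-up evidence values for two groups.
group_a = [[10.0, 14.0], [12.0]]   # two images, mean evidence 12
group_b = [[6.0], [6.0, 6.0]]      # two images, mean evidence 6
evidence = [mean_evidence(group_a), mean_evidence(group_b)]

print("UFM:", ufm(evidence))
print("UFM at parity:", ufm([12.0, 12.0]))
print("UFM after rescaling evidence by 10:", ufm([10 * e for e in evidence]))
```

Rescaling every group's evidence by the same factor leaves the score essentially unchanged (up to the small \(\varepsilon\) term), which illustrates the scale invariance claimed in the text.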

Adversarial Privacy Disentanglement

The feature extractor \(f(x;\theta)\) is augmented with an adversarial classifier \(A(h;\phi)\) that predicts the sensitive attribute \(s \in \{1,\ldots,K\}\) from the representation \(h\). A Gradient Reversal Layer \(R_{\lambda}\) acts as the identity in the forward pass and scales gradients by \(-\lambda_\mathrm{priv}\) during backpropagation, turning the local minimax into:

\[\min_{\theta}\;\max_{\phi}\;\mathbb{E}_{(x,s)\sim\mathcal{D}_i}\!\left[-\lambda_\mathrm{priv}\sum_{k=1}^{K} \mathbf{1}\{s=k\}\log A_k\!\left(R_\lambda\big(f(x;\theta)\big);\,\phi\right)\right]\]
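To make the gradient-reversal mechanics concrete, here is a framework-free sketch on a scalar toy chain. It assumes a binary sensitive attribute (\(K=2\)), a one-parameter encoder \(h=\theta x\), and a one-parameter adversary with logit \(a=\phi h\); all of these are simplifications for illustration and not the YOLOv8-based model. In practice this would be written as a custom autograd operation in a deep learning framework.

```python
import math

class GradientReversal:
    """Identity in the forward pass; multiplies the gradient by -lam going backward."""

    def __init__(self, lam: float):
        self.lam = lam

    def forward(self, h: float) -> float:
        return h

    def backward(self, grad_out: float) -> float:
        return -self.lam * grad_out

def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))

def adversary_grads(theta, phi, x, s, lam):
    """Toy chain: h = theta*x -> GRL -> a = phi*h -> binary cross-entropy with label s.

    Returns (grad wrt phi, grad wrt theta). Hypothetical helper for illustration.
    """
    grl = GradientReversal(lam)
    h = theta * x
    a = phi * grl.forward(h)
    dloss_da = sigmoid(a) - s                 # derivative of BCE wrt the logit
    grad_phi = dloss_da * h                   # adversary gets the ordinary gradient
    grad_h = grl.backward(dloss_da * phi)     # encoder gets the reversed, scaled one
    grad_theta = grad_h * x
    return grad_phi, grad_theta

g_phi, g_theta = adversary_grads(theta=0.5, phi=2.0, x=1.0, s=1, lam=0.1)
print("adversary gradient:", g_phi)
print("encoder gradient (reversed):", g_theta)
```

A descent step on both parameters therefore improves the adversary's prediction while pushing the encoder in the direction that makes the attribute harder to predict.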

By Fano's inequality, as \(I(H;S) \to 0\) the minimum achievable attribute-inference error approaches chance level: \(P_e \to 1 - 1/K\) for a uniformly distributed attribute. The coefficient \(\lambda_\mathrm{priv}\) directly tunes the privacy–utility trade-off along a piecewise-convex frontier (Appendix C).
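The Fano argument can be checked numerically. The sketch below assumes \(S\) is uniform over \(K\) classes, so that \(H(S\mid H) = \log K - I(H;S)\), and solves \(h_b(P_e) + P_e\log(K-1) \ge H(S\mid H)\) for the smallest admissible \(P_e\) by bisection. Function names are illustrative, and quantities are in nats.

```python
import math

def fano_rhs(p: float, k: int) -> float:
    """Right-hand side of Fano's inequality: h_b(p) + p*log(k-1), in nats."""
    hb = 0.0 if p in (0.0, 1.0) else -p * math.log(p) - (1 - p) * math.log(1 - p)
    return hb + p * math.log(k - 1)

def min_error(mi_nats: float, k: int, iters: int = 60) -> float:
    """Smallest error probability compatible with Fano for uniform S over k >= 2 classes."""
    target = max(0.0, math.log(k) - mi_nats)   # H(S|H) = log k - I(H;S)
    lo, hi = 0.0, 1.0 - 1.0 / k                # fano_rhs is increasing on this interval
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if fano_rhs(mid, k) < target:
            lo = mid
        else:
            hi = mid
    return hi

for mi in (1.0, 0.5, 0.1, 0.0):
    print(f"I(H;S) = {mi:.1f} nats -> P_e >= {min_error(mi, k=4):.3f}")
```

As the mutual information shrinks, the lower bound rises toward \(1 - 1/K\), which is 0.75 for \(K=4\).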

Joint Optimization and Aggregation

Each client minimises the composite local objective balancing detection, privacy, and fairness:

\[\mathcal{L}_\mathrm{local}(\theta,\phi) \;=\; \underbrace{\mathcal{L}_\mathrm{task}(\theta)}_{\text{detection}} \;+\; \underbrace{\lambda_\mathrm{priv}\,\mathcal{L}_\mathrm{adv}(\theta,\phi)}_{\text{privacy}} \;+\; \underbrace{\lambda_\mathrm{fair}\,\mathcal{L}_\mathrm{uncertainty}(\theta)}_{\text{fairness}}\]

After local updates, client \(i\) transmits \((\Delta\theta_i,\,\mathrm{UFM}_i)\) to the server. The server performs fairness-weighted global aggregation:

\[\omega_i = \frac{\exp(-\beta\,\mathrm{UFM}_i)}{\displaystyle\sum_{j=1}^{N} \exp(-\beta\,\mathrm{UFM}_j)}, \qquad \theta_G^{(t+1)} = \theta_G^{(t)} + \eta\sum_{i=1}^{N} \omega_i\,\Delta\theta_i\]

As \(\beta \to 0\) this recovers uniform FedAvg; larger \(\beta\) concentrates weight on clients with minimal inter-group uncertainty disparity. A deterministic gate (per-client validation mAP floor 0.30) blocks low-confidence clients from receiving high weight.
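A minimal sketch of the server-side rule follows. The softmax weighting and the update match the equations above; how the mAP gate is applied is not spelled out here, so this sketch assumes that clients below the floor simply receive zero weight. Parameters are treated as flat lists of floats, and the function names are hypothetical.

```python
import math

def fairness_weights(ufm_scores, beta, val_map=None, map_floor=0.30):
    """Softmax over -beta * UFM_i.

    ASSUMPTION: the deterministic gate is modelled by giving zero weight to
    clients whose validation mAP is below map_floor.
    """
    n = len(ufm_scores)
    passed = [True] * n if val_map is None else [m >= map_floor for m in val_map]
    if not any(passed):
        raise ValueError("no client passed the mAP gate")
    # Subtract the smallest admitted score for numerical stability.
    shift = min(u for u, ok in zip(ufm_scores, passed) if ok)
    raw = [math.exp(-beta * (u - shift)) if ok else 0.0
           for u, ok in zip(ufm_scores, passed)]
    total = sum(raw)
    return [r / total for r in raw]

def aggregate(theta_g, deltas, weights, eta=1.0):
    """theta_G <- theta_G + eta * sum_i w_i * delta_i, coordinate by coordinate."""
    return [t + eta * sum(w * d[j] for w, d in zip(weights, deltas))
            for j, t in enumerate(theta_g)]

scores = [0.1, 0.5, 0.9]                     # made-up UFM values
print(fairness_weights(scores, beta=0.0))    # uniform weights
print(fairness_weights(scores, beta=2.0))    # favors low-UFM clients
print(fairness_weights(scores, beta=2.0, val_map=[0.5, 0.2, 0.6]))
```

Setting `beta=0.0` gives every admitted client the same weight, matching the FedAvg limit described above.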

Algorithm 1. RESFL Training with Adversarial Privacy and Uncertainty-Guided Aggregation

Input: global model \(\theta_G\), \(N\) clients \(\{\mathcal{D}_i\}\), \(\lambda_\mathrm{priv}=0.1\), \(\lambda_\mathrm{fair}=0.01\), \(\beta=2.0\), learning rates \(\eta,\,\eta_\phi\), rounds \(T=100\)
for \(t = 0 \to T-1\) do
  Server broadcasts \(\theta_G^{(t)}\) to all \(N\) clients
  for each client \(i\) in parallel do
    Initialise \(\theta_i \leftarrow \theta_G^{(t)}\)
    for each local SGD step do
      \(\phi_i \;\leftarrow\; \phi_i - \eta_\phi\,\nabla_{\phi_i}\mathcal{L}_\mathrm{adv}(\theta_i,\phi_i)\)  // adversary maximisation step
      \(\theta_i \;\leftarrow\; \theta_i - \eta\,\nabla_{\theta_i}\!\left[\mathcal{L}_\mathrm{task} + \lambda_\mathrm{priv}\mathcal{L}_\mathrm{adv} + \lambda_\mathrm{fair}\mathcal{L}_\mathrm{unc}\right]\)  // GRL negates \(\nabla\mathcal{L}_\mathrm{adv}\) for \(\theta_i\)
    Compute \(\mathrm{UFM}_i\) via held-out local val split; set \(\Delta\theta_i = \theta_i - \theta_G^{(t)}\)
    Send \(\bigl(\Delta\theta_i,\;\mathrm{UFM}_i\bigr)\) to server
  Server computes \(\omega_i \propto \exp\!\bigl(-\beta\cdot\mathrm{UFM}_i\bigr)\)  // Eq. (5)
  \(\theta_G^{(t+1)} \;=\; \theta_G^{(t)} + \eta\,\displaystyle\sum_{i=1}^{N} \omega_i\,\Delta\theta_i\)  // Eq. (6): fairness-weighted global update
Output: final global model \(\theta_G^{(T)}\)
Experimental Results

Setup

We evaluate in an autonomous vehicle context using FACET (32,000 real-world images, 50,000+ person instances annotated on the 10-level Monk Skin Tone scale, split across \(N=4\) IID clients) and CARLA (6,000 fine-tuning frames + 7,800 evaluation frames across 3 urban maps under 13 weather conditions). RESFL is implemented on a modified YOLOv8 backbone with an evidential Dirichlet head and trained for \(T=100\) communication rounds with \(\lambda_\mathrm{priv}=0.1\), \(\lambda_\mathrm{fair}=0.01\), \(\beta=2.0\); results are averaged over 3 seeds.

Key finding: RESFL is the only method that simultaneously achieves near-best performance across all four axes: utility, fairness, privacy, and robustness. No single baseline dominates across all metrics.

mAP (FACET): 0.665. Competitive vs. FairFed (0.701) at far superior privacy & fairness.
\(\Delta\)EOP: 0.196. Best equality-of-opportunity gap, a 17% improvement over the FedAvg baseline.
Avg. Attack SR: 0.209. Lowest combined MIA + AIA success rate across all compared methods.

FACET Benchmark Comparison

Table 1. Comparison on FACET across utility (mAP \(\uparrow\)), fairness (\(|1-\mathrm{DI}|\,\downarrow\), \(\Delta\mathrm{EOP}\,\downarrow\)), privacy attacks (MIA SR \(\downarrow\), AIA SR \(\downarrow\)), and robustness attacks (BA AD \(\downarrow\), DPA EODD \(\downarrow\)). Mean ± std over 3 seeds. Bold = best per column. RESFL achieves the best or joint-best on 5 of 7 metrics.

Resilience Under Adverse Weather (CARLA)

We evaluate under cloud, rain, and fog at intensities 0–100% across 3 urban CARLA maps. Fog is the hardest condition as it removes scene structure unevenly — objects with darker appearance disappear first, amplifying group disparities. RESFL's evidence flooring, temperature control, and vacuity masking slow the growth of disparity and attack success under all conditions.

Figure 2. Performance under cloud, rain, and fog at 0–100% intensity. Rows: mAP (\(\uparrow\)); Fairness Score — mean of \(|1-\mathrm{DI}|\) and \(\Delta\mathrm{EOP}\) (\(\downarrow\)); Privacy Score — mean of MIA and AIA success rates (\(\downarrow\)). RESFL (solid black) maintains the strongest overall profile. Shaded bands = ±1 std.
Figure 3. CARLA simulation frames at 0%, 25%, 50%, 75%, and 100% intensity for cloud, rain, and fog. Fog uniquely eliminates scene structure entirely at high intensity; cloud and rain primarily affect contrast and texture. This asymmetric information loss drives the sharper fairness and privacy degradation seen in fog experiments.
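The composite scores plotted in Figure 2 are simple means, which the following sketch spells out. It assumes the common definitions of disparate impact (ratio of the lowest to the highest group-wise positive rate) and of the equality-of-opportunity gap (spread of group-wise true-positive rates); the page does not restate these definitions, and the input numbers are made up for illustration.

```python
def disparate_impact(positive_rates):
    """ASSUMED definition: lowest over highest group-wise positive rate (1.0 is parity)."""
    return min(positive_rates) / max(positive_rates)

def eop_gap(true_positive_rates):
    """ASSUMED definition: largest minus smallest group-wise true-positive rate."""
    return max(true_positive_rates) - min(true_positive_rates)

def fairness_score(positive_rates, true_positive_rates):
    """Mean of |1 - DI| and the EOP gap (lower is better)."""
    di_term = abs(1.0 - disparate_impact(positive_rates))
    return 0.5 * (di_term + eop_gap(true_positive_rates))

def privacy_score(mia_sr, aia_sr):
    """Mean of MIA and AIA success rates (lower is better)."""
    return 0.5 * (mia_sr + aia_sr)

# Made-up group-wise rates, for illustration only.
print("fairness score:", fairness_score([0.8, 0.6], [0.9, 0.7]))
print("privacy score:", privacy_score(0.3, 0.1))
```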

Scalability: IID vs. Non-IID

Under non-IID client data (Dirichlet split, \(\alpha=0.5\), \(H_\mathrm{TV} \approx 0.33\)), RESFL maintains the best joint fairness–privacy profile and the highest mAP (0.538). Scaling to \(N=8\) clients under IID, RESFL achieves mAP 0.654, fairness score 0.220, and privacy score 0.206. Its communication cost is \(\mathcal{O}(N|\theta|)\), linear in the number of clients and matching FedAvg.
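For readers unfamiliar with Dirichlet splits, the sketch below shows one common way to build a non-IID partition with concentration \(\alpha=0.5\): for each class, client proportions are drawn from a symmetric Dirichlet and that class's samples are divided accordingly. Which label the split is keyed on and the exact procedure used in the experiments are not stated here, so this is an assumption-laden illustration with hypothetical names.

```python
import random

def dirichlet_partition(labels, n_clients, alpha=0.5, seed=0):
    """Split sample indices across clients with per-class Dirichlet(alpha) proportions."""
    rng = random.Random(seed)
    clients = [[] for _ in range(n_clients)]
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        # Normalized Gamma(alpha, 1) draws are a Dirichlet(alpha, ..., alpha) sample.
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(n_clients)]
        total = sum(gammas)
        start, cum = 0, 0.0
        for c, g in enumerate(gammas):
            cum += g / total
            # The last client takes the remainder so every index is assigned.
            end = len(idxs) if c == n_clients - 1 else round(cum * len(idxs))
            clients[c].extend(idxs[start:end])
            start = end
    return clients

labels = [i % 3 for i in range(90)]          # toy labels: 3 classes, 30 samples each
parts = dirichlet_partition(labels, n_clients=4, alpha=0.5, seed=0)
for c, p in enumerate(parts):
    counts = [sum(1 for i in p if labels[i] == k) for k in range(3)]
    print(f"client {c}: {len(p)} samples, per-class counts {counts}")
```

Smaller values of `alpha` concentrate each class on fewer clients, while large values approach an IID split.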

Generalization & Label-Free Variants

Cross-Domain Results

To test domain-agnosticism, we evaluate on structured tabular data (Adult income, sensitive attribute: race) and text classification (TweetEval sentiment, sensitive attribute: gender) with \(N=4\) IID clients.

Adult Dataset

| Algorithm | Acc ↑ | Fair ↓ | Priv ↓ |
|---|---|---|---|
| FedAvg | 0.853 | 0.319 | 0.372 |
| FedAvg-DP | 0.706 | 0.302 | 0.222 |
| FairFed | 0.845 | 0.256 | 0.397 |
| PUFFLE | 0.829 | 0.295 | 0.285 |
| RESFL (ours) | 0.848 | 0.232 | 0.239 |

TweetEval Dataset

| Algorithm | Acc ↑ | Fair ↓ | Priv ↓ |
|---|---|---|---|
| FedAvg | 0.526 | 0.044 | 0.343 |
| FedAvg-DP | 0.372 | 0.042 | 0.195 |
| FairFed | 0.531 | 0.036 | 0.408 |
| PUFFLE | 0.496 | 0.044 | 0.285 |
| RESFL (ours) | 0.507 | 0.033 | 0.235 |
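To read the two tables side by side, it can help to express RESFL's fairness and privacy scores as relative reductions against FedAvg. The short sketch below does this arithmetic using only the FedAvg and RESFL rows reported above; the derived percentages are computed here for illustration and are not figures quoted from the paper.

```python
# (accuracy, fairness score, privacy score) copied from the tables above.
adult = {"FedAvg": (0.853, 0.319, 0.372), "RESFL": (0.848, 0.232, 0.239)}
tweet = {"FedAvg": (0.526, 0.044, 0.343), "RESFL": (0.507, 0.033, 0.235)}

def relative_reduction(baseline: float, ours: float) -> float:
    """Fraction by which `ours` is lower than `baseline`."""
    return (baseline - ours) / baseline

def summarize(name, table):
    base, ours = table["FedAvg"], table["RESFL"]
    return {
        "dataset": name,
        "accuracy_drop": base[0] - ours[0],                  # absolute difference
        "fairness_reduction": relative_reduction(base[1], ours[1]),
        "privacy_reduction": relative_reduction(base[2], ours[2]),
    }

for row in (summarize("Adult", adult), summarize("TweetEval", tweet)):
    print(row)
```

On Adult this works out to roughly a 27% lower fairness score and a 36% lower privacy score than FedAvg, at an accuracy cost of 0.005.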

Attribute-Free UFM Variants

When sensitive labels are unavailable (for example, in regulated deployments), three label-free variants derive latent cohorts from model internals.

All three variants recover over 90% of the fairness and privacy improvements of labeled RESFL with less than 2% accuracy drop, confirming that uncertainty-guided aggregation captures latent heterogeneity without explicit demographic supervision.

Citation

If you find RESFL useful in your research, please cite:

@inproceedings{
wasif2026resfl,
title={{RESFL}: An Uncertainty-Aware Framework for Responsible Federated Learning by Balancing Privacy, Fairness and Utility},
author={Dawood Wasif and Terrence J Moore and Jin-Hee Cho},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=Wfz7gpoDSl}
}
Acknowledgements. This work was supported in part by the National Science Foundation under Award No. 2107450 and by the Army Research Office under Grant No. W911NF-24-2-0241.