
iDML: Incentivized Decentralized Machine Learning

Abstract

With the rising emergence of decentralized and opportunistic approaches to machine learning, end devices are increasingly tasked with training deep learning models on-device using crowd-sourced data that they collect themselves. These approaches are desirable from a resource consumption perspective and also from a privacy preservation perspective. When the devices benefit directly from the trained models, the incentives are implicit. However, explicit incentive mechanisms must be provided when end-user devices are asked to contribute their resources (e.g., computation, communication, and data) to a task performed primarily for the benefit of others. This paper proposes a novel blockchain-based incentive mechanism for completely decentralized and opportunistic learning architectures. A smart contract is leveraged not only to provide explicit incentives for end devices to participate in decentralized learning but also to create a fully decentralized mechanism to inspect and reflect on the behavior of the learning architecture.

Introduction

Machine learning has been heavily applied to various user-facing tasks such as photo classification, input prediction, and smart home systems. Achieving high accuracy for these tasks requires massive training data, but privacy concerns and limited communication bandwidth between end-user devices and the cloud make gathering sufficient data challenging. While federated learning addresses some of these concerns through distributed training with a central coordinator, some contexts require completely decentralized opportunistic learning, where the learning process progresses only through opportunistic device-to-device encounters. Two styles of decentralized learning are considered: gossip-based learning, where devices exchange model snapshots in a merge-update-send cycle to learn a shared global model, and opportunistic collaborative learning (OppCL), where each device learns its own personalized model through send-train-return cycles with encountered neighbors. A significant challenge is incentivizing devices to contribute their local resources for the greater good, which is the main problem addressed by iDML.
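The gossip-based merge-update-send cycle described above can be sketched as follows. This is a minimal illustration, not code from the paper: model weights are plain NumPy arrays, the merge is a pairwise average, and the local update is represented by a precomputed gradient step for brevity.

```python
import numpy as np

def gossip_round(local_weights, received_weights, gradient_step):
    """One merge-update-send cycle of gossip learning (simplified sketch).

    Merge: average the received model snapshot into the local one.
    Update: apply one local training step (here a precomputed gradient step).
    Send: return the new snapshot, to be forwarded to the next neighbor.
    """
    merged = [(w + r) / 2.0 for w, r in zip(local_weights, received_weights)]
    updated = [w - g for w, g in zip(merged, gradient_step)]
    return updated

# Toy example: two one-layer "models" and a zero gradient step,
# so the result is just the element-wise average.
a = [np.array([1.0, 3.0])]
b = [np.array([3.0, 1.0])]
out = gossip_round(a, b, [np.zeros(2)])
```

In OppCL, by contrast, the neighbor would train on the learner's model directly and return it, rather than both devices converging on one shared model.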

System Design

The iDML system defines three roles: (1) the Learner, the participating device that seeks to benefit from a particular encounter; (2) the Neighbor, the device that contributes resources so the Learner can learn; and (3) the Validators, participants who validate the Neighbor's contribution. A device can take on more than one role at different times. The smart contract implementation is kept separate from the core decentralized learning functionality, making it portable and generalizable. An algorithm-specific adapter choreographs interactions between iDML and the decentralized learning implementation. The system workflow operates through the following steps, centered around the smart contract:

  0. Initialization: Participants exchange digital currency for tokens via an ERC-20 token standard and stake tokens to participate
  1. Pre-Training Check: The Learner and Neighbor check in with the smart contract, and the prepayment is authorized
  2. Learning: The decentralized learning algorithm executes without smart contract interaction
  3. Learning Complete: Results are intercepted by the iDML Adapter, which reports MD5 hashes to the contract
  4. Validation: Validators assess the updated model relative to the Learner's previous model
  5. Voting: Validators vote on whether the Neighbor's contribution was positive
  6. Result Check: Finalization is triggered when the voting threshold $V$ is met or the maximum voting time elapses
  7. Award Distribution: The smart contract distributes rewards or applies penalties
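The contract-facing part of this workflow can be sketched as a small in-memory state machine. This is a toy simulation in Python, not the actual Solidity contract; the class and method names are our assumptions, chosen to mirror the steps above.

```python
import hashlib

class IdmlContractSim:
    """Toy simulation of the iDML workflow (not the real Solidity contract)."""

    def __init__(self, vote_threshold):
        self.vote_threshold = vote_threshold  # the voting threshold V
        self.model_hash = None
        self.votes = []
        self.finalized = False

    def check_in(self, learner, neighbor, prepayment):
        # Pre-Training Check: record the participants and escrow the prepayment.
        self.learner, self.neighbor, self.escrow = learner, neighbor, prepayment

    def report_result(self, model_bytes):
        # Learning Complete: the iDML Adapter reports an MD5 hash of the result,
        # so validators can later check they evaluated the same model.
        self.model_hash = hashlib.md5(model_bytes).hexdigest()

    def vote(self, is_positive):
        # Voting / Result Check: collect validator votes and finalize
        # once the threshold is reached.
        self.votes.append(is_positive)
        if len(self.votes) >= self.vote_threshold:
            self.finalized = True
        return self.finalized

sim = IdmlContractSim(vote_threshold=3)
sim.check_in("learner-1", "neighbor-1", prepayment=10)
sim.report_result(b"model-weights")
for v in (True, True, True):
    done = sim.vote(v)
```

The real contract would additionally enforce staking, handle the timeout path of the Result Check, and perform the Award Distribution on-chain.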

Malicious Model Prevention

iDML addresses two significant classes of attacks: incentive mechanism attacks (where attackers aim to gain rewards without meaningful participation) and learning attacks (poisoning attacks that seek to cause the model to misclassify inputs). For incentive mechanism attacks, the staking requirement in Step 0 provides a mechanism to penalize dishonest participants. The reward/penalty structure ensures that the penalty an attacker may receive is equal to or slightly larger than what they may earn, making the expected payoff of an incentive mechanism attack at most zero. For learning attacks, the staking and validation steps are effective deterrents, as dishonest Neighbor participants face penalties applied to their staked tokens.
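The deterrence argument can be made concrete with a small expected-value calculation. The payoff model below is our simplification (the paper does not give this exact formula): an attacker earns the reward when undetected and loses the penalty, taken from staked tokens, when validators detect the attack.

```python
def attack_expectation(reward, penalty, detect_prob):
    """Expected payoff of an incentive-mechanism attack.

    reward:      tokens gained if the attack goes undetected
    penalty:     staked tokens slashed if validators detect it
    detect_prob: probability that validation catches the attack
    """
    return (1.0 - detect_prob) * reward - detect_prob * penalty

# With penalty >= reward and detection at least as likely as not,
# the expectation is non-positive, so a rational attacker is deterred.
e = attack_expectation(reward=1.0, penalty=1.0, detect_prob=0.5)
```

Setting the penalty slightly above the reward, as the text describes, pushes the expectation strictly below zero for any detection probability of at least one half.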

Reward and Penalty Distribution

The reward distribution follows five cases based on validator votes. When all validators agree the contribution is valid ($n_v^t \neq 0$; $n_v^f = 0$), the Neighbor receives $r_n = p_r^n \cdot r$ and the validators share $r_v^t = \frac{p_r^v \cdot r}{n_v^t}$. When validators unanimously reject ($n_v^t = 0$; $n_v^f \neq 0$), the prepayment returns to the Learner ($r_n = -r$) and the validators who voted no share a portion ($r_v^f = \frac{r}{n_v^f}$). Mixed voting scenarios distribute rewards proportionally, with the parameters $p_r^n$ and $p_r^v$ (where $p_r^v + p_r^n = 1$) configurable per decentralized learning algorithm. The tolerance parameter $\tau$ defines the lower bound of acceptable trained-model accuracy, allowing for expected fluctuations while preventing manipulation.
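The two unanimous cases translate directly into code. This sketch implements only those two of the five cases (the mixed-vote cases distribute proportionally and are omitted); the function name and return structure are ours, but the formulas follow the text, with $r$ as the prepayment and $p_r^n + p_r^v = 1$.

```python
def distribute(prepayment, n_true, n_false, p_n, p_v):
    """Reward split for the two unanimous voting cases of iDML.

    prepayment: the escrowed amount r
    n_true:     validators voting that the contribution is valid (n_v^t)
    n_false:    validators voting that it is not (n_v^f)
    p_n, p_v:   reward shares for Neighbor and validators (p_n + p_v = 1)
    """
    assert abs(p_n + p_v - 1.0) < 1e-9
    if n_true > 0 and n_false == 0:
        # Unanimous accept: Neighbor gets p_n * r; validators split p_v * r.
        return {"neighbor": p_n * prepayment,
                "per_true_validator": p_v * prepayment / n_true}
    if n_true == 0 and n_false > 0:
        # Unanimous reject: prepayment returns to the Learner (r_n = -r);
        # rejecting validators share r / n_v^f.
        return {"neighbor": -prepayment,
                "per_false_validator": prepayment / n_false}
    raise NotImplementedError("mixed votes are distributed proportionally")

payout = distribute(prepayment=10.0, n_true=2, n_false=0, p_n=0.8, p_v=0.2)
```

With $p_r^n = 0.8$ and two accepting validators, the Neighbor here receives 8 tokens and each validator 1 token.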

Evaluations

The system is evaluated with two state-of-the-art decentralized learning algorithms, gossip learning and OppCL, on CIFAR-10 (60,000 32x32 color images, 10 classes) and Fashion-MNIST (70,000 28x28 grayscale images, 10 classes) under both IID and Non-IID data distributions. The smart contract is implemented in Solidity and deployed on Ganache. For gossip learning, 10 convolutional layers with max pooling and dropout are used; for OppCL, three CNN model sizes with 6, 10, and 14 layers are employed. Results with 50 simulated participants show that three validators provide stable voting accuracy across different settings. In IID settings, both learning algorithms show stable voting accuracy patterns. The system effectively identifies and marginalizes attackers: when 10 learning attackers (who flip labels) are injected, iDML initially follows a similar accuracy decline but recovers after approximately 40,000 blocks as attackers' stakes fall below the participation threshold and they are eliminated from the system. The cost analysis shows a total gas cost of 457,225 per learning encounter, translating to approximately $10.02 on Ethereum or $0.03 on Polygon.
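The gas-to-dollars conversion behind figures like these is a simple product. The sketch below shows the arithmetic; the gas price and token price used in the example are our illustrative assumptions, not figures taken from the paper, so the result only lands near the reported ~$10 value.

```python
def tx_cost_usd(gas_used, gas_price_gwei, native_token_usd):
    """Approximate USD cost of a transaction.

    gas_used:         total gas consumed (e.g., 457,225 per iDML encounter)
    gas_price_gwei:   gas price in gwei (1 gwei = 1e-9 native token)
    native_token_usd: USD price of the chain's native token (ETH, MATIC, ...)
    """
    return gas_used * gas_price_gwei * 1e-9 * native_token_usd

# Illustrative values only (assumed, not from the paper):
eth_cost = tx_cost_usd(457_225, gas_price_gwei=12, native_token_usd=1800)
```

The same gas figure costs orders of magnitude less on Polygon mainly because both the gas price and the native token price are far lower.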

Conclusion

iDML presents a framework for incentivizing decentralized and opportunistic machine learning. The evaluation results demonstrate that the system can effectively compensate neighboring devices that provide communication and computational resources to assist other participants in model learning tasks. The system can detect attackers and prevent malicious models from being merged into users' models. The evaluation across different decentralized learning algorithms, datasets, and data distributions demonstrates applicability to a wide range of use cases. Future work includes evaluating the system with large-scale real-world contact patterns and considering data item rarity for more nuanced incentive mechanisms.
