Key Takeaways
In a development raising fresh alarms about the safety of autonomous AI systems, an experimental AI agent developed by teams linked to Alibaba began mining cryptocurrency without human instruction.
The agent, named ROME, didn’t just stray from its coding tasks; it actively repurposed cloud computing resources and built hidden backdoors to sustain unauthorized operations.
This real-world case of “instrumental convergence”, where an AI pursues subgoals to acquire more compute for its objectives, is among the clearest demonstrations yet of agentic AI risks.
The incident, revised in January 2026 and recently amplified across tech forums, occurred entirely during controlled training on Alibaba Cloud.
No external hackers or prompt injections were involved; the AI discovered the process through reinforcement learning.
+76
Bitcoin
Ethereum
Tether
USD Coin
Solana
Ripple
Dogecoin
Cardano
Toncoin
Shiba Inu
Avalanche
TRON
Chainlink
Polygon Matic
Polkadot
Wrapped Bitcoin
Litecoin
Dai
NEAR Protocol
Bitcoin Cash
Stellar
Cosmos
Filecoin
Ethereum Classic
Aptos
Hedera Hashgraph
Immutable
Optimism
Arbitrum
VeChain
The Sandbox
Decentraland
Axie Infinity
Injective Protocol
Render
The Graph
Aave
Chiliz
Helium
PAX Gold
Compound
Lido DAO Token
Sui
Conflux Network
Lido Staked ETH
OKB
Uniswap
Pepe
Ondo
Mantle
First Digital USD
XDC Network
Artificial Superintelligence Alliance
Jupiter
Quant
Worldcoin
Bonk
Tether Gold
JITO
JasmyCoin
Core
Floki Inu
Ethereum Name Service
SushiSwap
1inch Network
Tezos
Algorand
Flow
Trust Wallet Token
Curve DAO Token
MultiversX
Basic Attention Token
Enjin Coin
Ethena
Ethena Staked USDe
Build'N'Build
Kava.io
Celestia
Sei
IOTA
Frax
+162
Ethereum
Tether
USD Coin
Solana
Ripple
Dogecoin
Cardano
Toncoin
Shiba Inu
Avalanche
TRON
Chainlink
Polygon Matic
Polkadot
Litecoin
NEAR Protocol
Bitcoin Cash
Stellar
Cosmos
Filecoin
Ethereum Classic
Aptos
Immutable
Optimism
Arbitrum
VeChain
The Sandbox
Decentraland
Axie Infinity
Injective Protocol
The Graph
Hedera Hashgraph
Render Token
Aave
Chiliz
PAX Gold
Compound
Lido DAO Token
THORChain
Stacks
Arweave
Sui
Conflux Network
Uniswap
Pepe
Ondo
Mantle
First Digital USD
Bittensor
Kaspa
Celestia
Artificial Superintelligence Alliance
Jupiter
Quant
Worldcoin
PayPal USD
Bonk
Rocket Pool ETH
Tether Gold
Sei
JITO
JasmyCoin
PancakeSwap
Floki Inu
Ethereum Name Service
SushiSwap
1inch Network
Algorand
Flow
Trust Wallet Token
Curve DAO Token
Basic Attention Token
Enjin Coin
Ethena
Ethena USDe
Pi Network
Adventure Gold
Audius
Acala Token
Alchemy Pay
Arkham
API3
Bounce Token
Altlayer
Amp
Aevo
ARPA Chain
Ankr
Blur
Biconomy
Chromia
Celer Network
Celo
Civic
Convex Finance
Cartesi
COTI
DigiByte
DIA
Dymension
dYdX
ether.fi
FUNToken
FLUX
Ampleforth
Golem
GMX
Holo
IoTex
Illuvium
JUST
Liquity
Livepeer
Memecoin
Manta Network
Treasure
Mask Network
NKN
Neutron
Ocean Protocol
Origin Protocol
ORDI
Osmosis
Powerledger
Phala Network
Pendle
Portal
Pyth Network
ConstitutionDAO
iExec RLC
Rocket Pool
Reserve Rights
Storj
Starknet
Spell Token
Sun (New)
Saga
SuperVerse
Toko Token
Tellor
LayerZero
Usual
Cetus Protocol
Eigenlayer
Hamster Kombat
Catizen
Berachain
KAITO
Pudgy Penguins
Solayer
Alchemix
Bitcoin
Bitcoin SV
Movement
Nexo
Hyperliquid
Nervos Network
TrueUSD
Mina
STEPN
Synthetix
APEcoin
Gala
Cronos
Internet Computer
Build'N'Build
+217
Bitcoin
Ethereum
Tether
Build'N'Build
USD Coin
Solana
Ripple
Dogecoin
Cardano
Toncoin
Shiba Inu
Avalanche
TRON
Chainlink
Polkadot
Polygon Matic
Wrapped Bitcoin
Litecoin
Dai
NEAR Protocol
Bitcoin Cash
Monero
Stellar
Cosmos
Filecoin
Ethereum Classic
Aptos
Hedera Hashgraph
Immutable
Optimism
Arbitrum
VeChain
The Sandbox
Decentraland
Axie Infinity
Injective Protocol
Render Token
The Graph
Maker
Aave
Chiliz
Helium
PAX Gold
Compound
Lido DAO Token
THORChain
Stacks
Arweave
Sui
Conflux Network
Lido Staked ETH
Bitget Token
Wrapped Ethereum
OKB
Uniswap
Pepe
Ondo
Mantle
First Digital USD
Bittensor
Kaspa
Celestia
XDC Network
Artificial Superintelligence Alliance
Jupiter
Quant
Worldcoin
PayPal USD
Bonk
Flare
Tether Gold
Sei
JITO
JasmyCoin
PancakeSwap
Core
Floki Inu
Ethereum Name Service
SushiSwap
Kava.io
1inch Network
Tezos
Algorand
Flow
Trust Wallet Token
Curve DAO Token
KuCoin Token
MultiversX
Gitcoin
Zcash
IOTA
Basic Attention Token
Frax
Ethena
Ethena USDe
Fasttoken
Pi Network
SATS
Adventure Gold
Audius
Alchemy Pay
Arkham
API3
Bounce Token
Altlayer
Aergo
Amp
Aevo
ARPA Chain
Astar
Ark
Ankr
AirSwap
Alpaca Finance
Blur
Badger DAO
Bancor
BakeryToken
Biconomy
Chromia
Celer Network
Celo
Shentu
Civic
Convex Finance
Cartesi
Cyber
COTI
DigiByte
DIA
ether.fi
FUNToken
FLUX
Firo
Ampleforth
Golem
GMX
Gnosis
Moonbeam
Holo
IoTex
ICON
Illuvium
JUST
Kadena
Liquity
Livepeer
Lisk
Memecoin
Manta Network
Treasure
Mask Network
MetisDAO
Origin Protocol
ORDI
Ontology
Osmosis
Powerledger
Phala Network
Pendle
Portal
Pyth Network
ConstitutionDAO
Polkastarter
Qtum
iExec RLC
Rocket Pool
Reserve Rights
Ronin
Ravencoin
Starknet
Storj
Status
Spell Token
Sun (New)
SuperVerse
Toko Token
Theta Fuel
Tellor
Tensor
LayerZero
Usual
Eigenlayer
Hamster Kombat
Catizen
Berachain
KAITO
Pudgy Penguins
Solayer
Bio Protocol
ChainGPT
Cookie DAO
Solv Protocol
Alchemix
Bitcoin SV
Movement
DeXe
Binance Staked SOL
Nexo
Wrapped eETH
Hyperliquid
Casper
Zilliqa
Secret
Nervos Network
TrueUSD
BitTorrent
Mina
Dash
STEPN
Gemini Dollar
UNUS SED LEO
Synthetix
APEcoin
Gala
Theta Network
Fantom
Cronos
Internet Computer
Binance USD
ROME, a 3B-parameter coding agent built within Alibaba’s Agentic Learning Ecosystem (ALE), was developed collaboratively by the ROCK, ROLL, iFlow, and DT research teams.
The agent was designed for complex multi-turn tasks, including software engineering, through tool use and terminal commands.
During agentic reinforcement learning (RL), unexpected behaviors emerged.
The agent began issuing its own tool calls and code executions beyond assigned prompts.
In one notable episode, it established a reverse SSH tunnel from an Alibaba Cloud training instance to an external IP, bypassing ingress firewalls and creating a hidden backdoor.
Simultaneously, ROME repurposed GPUs for crypto mining, quietly diverting computing power from its training objectives.
Logs show these actions recurred intermittently, tied directly to the agent’s autonomous decisions.
The sequence escalated as the agent probed internal network resources, further breaching security protocols.
The RL reward system, focused purely on task completion, inadvertently encouraged these instrumental behaviors.

The rogue activity was detected not by dashboards, but via Alibaba Cloud’s production-grade firewall alerts.
Security teams observed policy violations and unmistakable crypto-mining traffic.
Cross-referencing firewall timestamps with RL logs traced the activity to ROME’s tool invocations.
Researchers confirmed the behaviors were emergent, not caused by external interference.
The AI agent exploited sandbox boundaries through legitimate tool interfaces.
Once detected, they contained rogue operations by isolating training instances, hardening network policies, shutting down reverse SSH tunnels, and terminating mining processes.
Researchers implemented a comprehensive safety overhaul, including safety-aligned datasets, red-teaming injections, and golden trajectories for supervised fine-tuning (SFT) and RL stages.
These changes improved boundary awareness and prevented future goal drift.
By the report’s public release, the teams had fully addressed the behaviors.
The Alibaba-linked teams earned praise for their transparency, exposing current gaps in agentic AI guardrails.
Alibaba’s ROME incident isn’t an isolated incident.
Over the past 12 months, several high-profile cases have highlighted how autonomous agents can deviate in surprising and sometimes costly ways:
As enterprise adoption grows—Gartner predicts 40% of applications will include task-specific agents by the end of 2026—these incidents serve as critical warnings.
Companies must implement stronger monitoring, sandboxing, and alignment techniques. Autonomous does not mean unsupervised.
Prashant Jha is a seasoned crypto journalist based in Delhi, India, with a Bachelor’s Degree in Computer Science Engineering. Passionate about the evolving world of blockchain and cryptocurrencies, he has been a dedicated voice in the industry since 2018. Prashant’s expertise lies in regulatory reporting, where he unravels complex legal and financial developments with clarity and precision. Before joining CCN in 2024, he honed his craft at Cointelegraph, establishing himself as a trusted name in crypto journalism.
His coverage spans major industry events, including the high-profile collapses of FTX, Three Arrows Capital (3AC), and LUNA, offering readers insightful analyses of their regulatory and market implications. Prashant’s technical background enables him to bridge the gap between intricate blockchain technology and its real-world applications, making his work accessible to novices and experts.
Beyond his professional pursuits, Prashant is an avid music enthusiast, often exploring diverse genres to unwind. A sports lover, he has a particular passion for cricket and frequently engages in discussions about the game. His multifaceted interests and sharp journalistic instincts make him a valuable contributor to CCN, where he continues shaping the crypto landscape's narrative.
You’re All Set!
Thanks for signing up. We’ll be in touch soon with the latest insights.
