Engineering the Invisible: How Shubham Malhotra Is Quietly Powering the Cloud

Update: 2023-12-13 20:38 IST

In an age where milliseconds matter and digital systems must run without interruption, engineers like Shubham Malhotra are quietly rewriting the playbook for scalable, fault-tolerant cloud infrastructure. While many developers focus on deployment, Malhotra, a software engineer at Microsoft Azure, goes deeper, drawing on peer-reviewed academic research to influence real-world architectures.

“I never wanted to just be a coder,” Malhotra says. “I wanted to understand the backbone, including the protocols, the failovers, the real-time data paths that keep the digital world running.”

This philosophy is reflected in two notable research papers he published during 2022 and 2023, both of which continue to shape his approach to building resilient systems in high-availability cloud environments.

Bridging Gaps in Optical Network Reliability

In 2022, Malhotra authored a paper titled “Optimizing Software Upgrades in Optical Transport Networks: Challenges and Best Practices”, published in Nanotechnology Perceptions. The research focused on zero-downtime deployment strategies for Optical Transport Networks (OTNs), which form the foundational layer of global telecom and data exchange.

“These aren’t just cables,” Malhotra explains. “They’re arteries of our global infrastructure. A small hiccup in software upgrades can trigger ripple effects across entire economies.”

His work proposed predictive resource provisioning, rollback-safe upgrades, and real-time traffic rerouting, strategies that were tested across multi-vendor lab setups. These techniques didn’t stay in academia. At Microsoft Azure, Malhotra adapted the same principles while leading the design of global CI/CD systems that auto-upgraded virtual machine extensions without disrupting customer workloads.

Graph Algorithms Meet the Real World

His recent publication, “Efficient Algorithms for Parallel Dynamic Graph Processing”, appeared in the International Journal of Communication Networks and Information Security. This work addressed a different but equally complex challenge: making sense of dynamic, ever-changing infrastructure.

“Most people don’t realize that modern cloud environments behave like living graphs,” says Malhotra. “Microservices interact, APIs evolve, logs span dependencies. It’s a huge dynamic map.”

His research introduced scalable parallel algorithms to update and traverse these dynamic graphs in real time. The implications for cloud computing were clear: better modeling of service dependencies, faster root-cause analysis during outages, and more intelligent system automation.

“These algorithms help not just with observability,” Malhotra notes, “but with resilience. You start seeing the failure paths before they occur.”
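To make the idea concrete, the sketch below models services as a small dependency graph that can be updated as edges appear and disappear, then walks it to find everything a failing service could affect. It is a minimal, single-threaded illustration only: the class, method, and service names are hypothetical, and it does not reproduce the parallel algorithms described in the paper.

```python
from collections import defaultdict, deque


class DependencyGraph:
    """Directed graph where an edge caller -> callee means caller depends on callee.

    Hypothetical illustration; not the algorithms from the paper.
    """

    def __init__(self):
        self.deps = defaultdict(set)        # service -> services it calls
        self.dependents = defaultdict(set)  # service -> services that call it

    def add_dependency(self, caller, callee):
        # Incremental update: record the edge in both directions.
        self.deps[caller].add(callee)
        self.dependents[callee].add(caller)

    def remove_dependency(self, caller, callee):
        # Incremental update: drop the edge if it exists.
        self.deps[caller].discard(callee)
        self.dependents[callee].discard(caller)

    def impacted_by(self, failed):
        """Return every service that depends on `failed`, directly or transitively."""
        seen = set()
        queue = deque([failed])
        while queue:
            node = queue.popleft()
            # .get avoids creating empty entries while traversing.
            for parent in self.dependents.get(node, ()):
                if parent != failed and parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return seen


# Hypothetical service names, for illustration only.
graph = DependencyGraph()
graph.add_dependency("frontend", "checkout")
graph.add_dependency("checkout", "payments")
graph.add_dependency("checkout", "inventory")
graph.add_dependency("payments", "database")

before = graph.impacted_by("database")

# The topology changes: checkout stops calling payments.
graph.remove_dependency("checkout", "payments")
after = graph.impacted_by("database")
```

In this toy example a database failure initially reaches payments, checkout, and frontend; once the checkout-to-payments edge is removed, only payments remains exposed. Tracing the failure path ahead of time in this way is the kind of analysis Malhotra describes, though at production scale the graph would be far larger and updated concurrently.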

Engineering with Theory in Mind

Colleagues who’ve worked with Malhotra describe him as unusually systems-minded for someone still in his twenties. He attributes much of that to his decision to pursue research alongside full-time engineering roles.

“I don’t separate theory and practice,” he says. “Research gives me the mental models. Production gives me the reality check. It’s a feedback loop.”

At Microsoft, Malhotra currently supports large-scale Azure Arc workloads, some of which are used by public sector and enterprise customers worldwide.

The Path Ahead

He remains grounded in his mission: to build systems that don’t just scale, but adapt, recover, and evolve.

“I’m not just trying to reduce downtime,” he reflects. “I’m trying to design for the unknown: systems that stay resilient even as everything around them changes.”

As infrastructure complexity continues to surge, voices like Malhotra’s, rooted in both research and real-world engineering, are helping redefine what it means to be a modern software engineer, one who builds with fault tolerance and optimisation in mind.
