Discover how Sinkove accelerated medical imaging AI by leveraging Amazon SageMaker HyperPod, reducing costs, and improving training times through a scalable AWS infrastructure.
Customer Overview
Sinkove is at the forefront of medical imaging AI, specialising in the development of advanced diffusion models for both 2D and 3D medical imaging. Their mission is to transform diagnostic and treatment workflows by enabling more precise and data-rich visualisations through AI-driven approaches. As the healthcare industry increasingly turns to AI for efficiency and accuracy, Sinkove’s solutions are helping to redefine what’s possible in medical diagnostics.
Customer Challenge
As demand for their AI-powered imaging tools grew, Sinkove encountered growing pains in their infrastructure. Relying on a combination of in-house servers and academic GPU clusters, they quickly hit limitations in compute capacity, training speed, and cost efficiency. These issues impacted their ability to iterate on models quickly and meet delivery timelines for clients. Specifically, they faced:
- Limited access to high-performance GPUs, creating a bottleneck in training large-scale diffusion models
- High infrastructure and operational costs, especially with rising compute needs
- Delayed project timelines, as training cycles stretched over days or weeks
- Scaling limitations, making it difficult to onboard new customers while maintaining model performance
Partner Solution
To overcome these hurdles, Sinkove partnered with Cloud Combinator, an AWS Advanced Consulting Partner, to rearchitect their training infrastructure using scalable AWS solutions. The strategy included:
- Amazon SageMaker HyperPod: A managed environment with scalable, high-performance GPU clusters designed for complex training workloads. HyperPod enabled Sinkove to train models faster and with more predictable performance.
- Amazon EC2 P4d Instances: Featuring NVIDIA A100 Tensor Core GPUs, these instances were selected to power Sinkove’s most compute-intensive workloads with optimal throughput and cost-efficiency.
- AWS Credits Optimisation: Sinkove had existing AWS credits through the NVIDIA Inception program. Cloud Combinator worked closely with AWS to extend credit usage and ensure maximum cost efficiency, especially during the transition period.
This move to a fully cloud-based architecture helped Sinkove eliminate training bottlenecks and enabled seamless scaling to meet both current and future demand.
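As a rough illustration of what provisioning such a cluster can involve, the sketch below builds the request for the SageMaker `CreateCluster` API with a single group of P4d instances. It is a minimal sketch under stated assumptions, not Sinkove's or Cloud Combinator's actual configuration: the cluster name, group name, instance count, S3 path, lifecycle script name, and IAM role ARN are all hypothetical placeholders.

```python
def build_hyperpod_cluster_request(
    cluster_name: str,
    instance_count: int,
    lifecycle_s3_uri: str,
    execution_role_arn: str,
) -> dict:
    """Build keyword arguments for the SageMaker CreateCluster API.

    All values passed in are illustrative; none come from the case study.
    """
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    return {
        "ClusterName": cluster_name,
        "InstanceGroups": [
            {
                # Hypothetical group name for the GPU worker nodes.
                "InstanceGroupName": "gpu-workers",
                # P4d instances carry NVIDIA A100 GPUs.
                "InstanceType": "ml.p4d.24xlarge",
                "InstanceCount": instance_count,
                # Lifecycle scripts run when each node is created.
                "LifeCycleConfig": {
                    "SourceS3Uri": lifecycle_s3_uri,
                    "OnCreate": "on_create.sh",  # hypothetical script name
                },
                "ExecutionRole": execution_role_arn,
            }
        ],
    }


request = build_hyperpod_cluster_request(
    cluster_name="example-training-cluster",            # placeholder
    instance_count=2,                                   # placeholder
    lifecycle_s3_uri="s3://example-bucket/lifecycle/",  # placeholder
    execution_role_arn="arn:aws:iam::123456789012:role/ExampleHyperPodRole",
)

# With AWS credentials configured, the request could be submitted as:
#   import boto3
#   boto3.client("sagemaker").create_cluster(**request)
```

Keeping the request construction separate from the API call makes the configuration easy to review and test before any billable resources are created.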
Results
The collaboration delivered a step change in operational performance:
- Training Time Reduction: Up to 40% faster model training, enabling quicker iteration cycles and faster time-to-market
- Cost Optimisation: Up to 50% savings on compute costs through better resource utilisation and credit strategy
- Time Efficiency: Weeks saved on infrastructure setup and training management through automation and cloud-native workflows
- Scalability: The new setup is designed to support the onboarding of 10–12 new customers over the next 12 months without any degradation in training performance or resource contention
By adopting AWS and working with Cloud Combinator, Sinkove not only resolved its immediate performance issues but also created a foundation for sustainable growth and innovation.
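To make the headline percentages concrete, the sketch below applies them to a baseline. It reads "up to 40% faster" as a 40% reduction in training time, which is an interpretation rather than something the case study states outright. The baseline figures (a 10-day training run and 1,000 cost units) are hypothetical and are not Sinkove's numbers.

```python
def apply_reduction(baseline: float, reduction: float) -> float:
    """Return the value left after cutting `baseline` by a fractional `reduction`."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return baseline * (1.0 - reduction)


# Hypothetical baselines, chosen only to illustrate the arithmetic.
baseline_training_days = 10.0
baseline_compute_cost = 1000.0  # arbitrary cost units

# Best-case figures implied by the "up to" percentages in the results.
best_case_days = apply_reduction(baseline_training_days, 0.40)
best_case_cost = apply_reduction(baseline_compute_cost, 0.50)

print(f"Training time: {baseline_training_days} -> {best_case_days} days")
print(f"Compute cost:  {baseline_compute_cost} -> {best_case_cost} units")
```

Because both figures are stated as "up to", they describe a ceiling on the improvement, so typical results may be smaller.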
Future Outlook
With a scalable, cloud-native infrastructure now in place, Sinkove is well-positioned to expand its customer base and further develop its diffusion model capabilities. The company’s leadership plans to continue leveraging AWS technologies for ongoing innovation, confident that the infrastructure can handle increasing workloads without compromising speed, cost, or reliability.
The partnership with Cloud Combinator and AWS also leaves Sinkove equipped to maintain its competitive edge in the rapidly evolving healthcare AI landscape.