MinIO at Cloud Field Day 23: Four Key Takeaways for Enterprise IT

I’ll be honest – MinIO wasn’t initially at the top of my list of must-see presentations at Cloud Field Day 23. But recent conversations in the IT community around their decision to remove the web-based management UI from their Community Edition had piqued my interest. The move generated quite a bit of discussion about open source sustainability and commercial strategies, and I was curious to hear their side of the story.

My interest was also personal. A developer I knew indirectly had been working on an interesting proof of concept using MinIO to store and serve 3D models generated from scans of animal cadavers and organs for veterinary education. The project could require massive storage capacity for detailed anatomical models, and there was also a desire to move smaller objects out of a SQL Server database and into object storage. It was exactly the kind of use case that showcases why object storage matters beyond simple file archiving – and why performance and scalability decisions have real-world implications for research and education.

What I got instead was a deep dive into AIStor, MinIO’s commercial offering, which represents their evolution from a simple S3-compatible storage solution into what they’re positioning as a comprehensive AI data platform. AB Periasamy, Jason Nadeau, and Dil Radhakrishnan walked us through the platform, designed specifically for AI and analytics workloads and complete with features I hadn’t expected to see from a storage vendor.

Here are the four key takeaways that stood out to me:

1. Object-Native vs. Gateway Storage: Why Architecture Matters for AI Workloads

Not gonna lie – when I first heard MinIO’s Jason Nadeau talk about “object-native architecture,” my initial reaction was “here we go with another vendor trying to differentiate their storage with fancy terminology.” But as he walked through the comparison between their approach and traditional object gateway solutions, it started making a lot more sense, especially for anyone who’s spent time dealing with the performance headaches that come from bolting new capabilities onto existing infrastructure.

The reality is many enterprise environments have been down this road before. Legacy SAN and NAS systems get extended and retrofitted for years because ripping and replacing storage infrastructure isn’t exactly a trivial decision. But what MinIO demonstrated is why that approach fundamentally doesn’t work when you’re talking about AI workloads that need to move massive amounts of data quickly and consistently. Their gateway-free, stateless, direct-attached architecture eliminates the translation layers that create bottlenecks – and anyone who’s ever tried to troubleshoot performance issues through multiple abstraction layers knows exactly what I’m talking about.

What makes this architectural difference even more compelling is how it enables features like PromptObject – AIStor’s ability to query unstructured data directly through the S3 API using natural language prompts. During Dil Radhakrishnan’s demo, you could literally ask a PDF or image to return structured JSON data without building complex RAG pipelines or maintaining separate vector databases. For known single-object queries, PromptObject removes the need for those components entirely—but it can also complement a RAG pipeline when broader inference or contextual chaining is required.
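
MinIO didn’t share implementation details we can borrow from here, so treat the following as a purely hypothetical sketch of what a PromptObject-style call could look like. The endpoint shape, the query parameter, and the response format are all my assumptions, not AIStor’s documented API:

```python
# Purely hypothetical sketch: the real AIStor PromptObject API may differ.
# The endpoint shape, query parameter, and response format are my assumptions.
import requests

ENDPOINT = "https://aistor.example.com"      # placeholder AIStor endpoint
BUCKET, KEY = "vet-models", "scans/canine-heart.pdf"

resp = requests.post(
    f"{ENDPOINT}/{BUCKET}/{KEY}",
    params={"prompt": "true"},               # hypothetical S3 API extension
    json={"prompt": "Return the anatomical structures labeled in this scan "
                    "as JSON objects with 'name' and 'page' fields."},
    auth=("ACCESS_KEY", "SECRET_KEY"),       # real requests would be SigV4-signed
    timeout=120,
)
print(resp.json())   # structured JSON pulled from an unstructured object
```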

When AB Periasamy talked about deployments with more than 60,000 drives across multiple racks, all needing atomic operations across multiple drives simultaneously, it hit home why traditional storage architectures break down. AI training and inference demand a level of performance and consistency that wasn’t even on the radar when most current storage infrastructure was designed. And increasingly, they also demand the kind of intelligent interaction with data that PromptObject represents – turning storage from a passive repository into an active participant in AI workflows.

MinIO also demonstrated something called the Model Context Protocol (MCP) – which, frankly, sounds like yet another acronym to keep track of, but actually does something useful. It’s Anthropic’s spec that MinIO has adopted to let AI agents talk directly to storage systems. So instead of pulling data out, processing it somewhere else, and shoving it back, an AI agent can just ask MinIO to list buckets, tag objects, or even build dashboards on the fly. It’s the kind of direct integration that makes sense once you see it in action, even if the name makes it sound more complicated than it needs to be.
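
To make that concrete, here’s a minimal sketch of my own (explicitly not MinIO’s official MCP server) that exposes a single list-buckets tool to an agent, using the reference mcp Python SDK and the standard minio client. Endpoint and credentials are placeholders:

```python
# A minimal MCP server sketch of my own, not MinIO's official implementation.
# It exposes one tool an AI agent could call to list buckets directly.
from mcp.server.fastmcp import FastMCP   # reference MCP Python SDK
from minio import Minio                  # standard MinIO Python SDK

mcp = FastMCP("object-storage")
client = Minio("play.min.io",            # placeholder endpoint/credentials
               access_key="ACCESS_KEY", secret_key="SECRET_KEY")

@mcp.tool()
def list_buckets() -> list[str]:
    """Return the names of all buckets the agent can see."""
    return [b.name for b in client.list_buckets()]

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio for an MCP-capable agent
```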

2. S3 Express API: What Amazon Learned About AI Storage Performance

AB Periasamy’s explanation of S3 Express was particularly interesting. Amazon’s decision to strip away certain features from their general-purpose API to optimize for AI workloads reveals where the real performance bottlenecks live.

The changes Amazon made tell a story about practical performance optimization. Getting rid of MD5 sum computations makes perfect sense – anyone who’s dealt with large file transfers knows that checksum calculation can be a significant CPU hit, especially when you’re talking about the massive datasets AI workloads require. Same goes for eliminating directory sorting on list operations. When you’re dealing with billions of objects, sorting is just a waste of compute resources that AI applications don’t actually need.

What’s particularly interesting from an enterprise IT perspective is that MinIO implemented S3 Express compatibility in AIStor, giving you the choice between regular S3 API and S3 Express without requiring any data format changes. You can literally restart the server and switch between APIs. That kind of flexibility is exactly what organizations need when they’re constantly balancing performance requirements with operational simplicity and budget constraints.
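
From the client’s side, that flexibility is the whole point: the code doesn’t change, only the server’s mode does. Here’s a minimal boto3 sketch against a placeholder endpoint, with credentials and bucket name as stand-ins of mine:

```python
# Sketch: the same boto3 code works whether the server speaks the regular
# S3 API or S3 Express semantics. Endpoint and credentials are placeholders.
import boto3
from botocore.config import Config

s3 = boto3.client(
    "s3",
    endpoint_url="https://aistor.example.com",   # placeholder AIStor endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
    # Newer botocore releases (>= 1.36) let you skip client-side checksum
    # work, which echoes the MD5-elimination point above.
    config=Config(request_checksum_calculation="when_required"),
)

# Listings still work, but under S3 Express semantics the server drops the
# sorted-order guarantee, so don't write code that depends on key ordering.
pages = s3.get_paginator("list_objects_v2").paginate(Bucket="training-data")
for page in pages:
    for obj in page.get("Contents", []):
        print(obj["Key"])
```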

3. GPU Direct Storage: Why Your CPU is the New Bottleneck

Here’s something that really made me rethink how modern compute infrastructure should be architected: AB’s explanation of how GPUs have become the main processor and CPUs have essentially become co-processors for AI workloads. For those of us who’ve spent years optimizing CPU and memory utilization, this represents a significant architectural shift.

The bottleneck isn’t the GPU processing power – it’s how fast you can get data to the GPU memory. Traditional architectures require data to flow from storage through the CPU and system memory before reaching the GPU, creating a chokepoint that limits the performance of expensive GPU hardware. GPU Direct Storage bypasses all that by using RDMA to move data directly from storage to GPU memory, with HTTP as the control plane and RDMA as the data channel.
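
To make the chokepoint concrete, here’s a quick sketch of the traditional bounce-buffer path in Python. The file name is a placeholder, and this is exactly the hop GPU Direct Storage eliminates:

```python
# Sketch of the traditional bounce-buffer path that GPU Direct Storage removes:
# bytes land in host (CPU) memory first, then get copied to the GPU.
# Requires cupy and a CUDA GPU; the file name is a placeholder.
import numpy as np
import cupy as cp

with open("shard-000.bin", "rb") as f:
    host = np.frombuffer(f.read(), dtype=np.uint8)   # storage -> CPU memory

device = cp.asarray(host)                            # CPU memory -> GPU memory

# With GPU Direct Storage (NVIDIA cuFile, or the kvikio Python bindings),
# the host-memory hop above disappears: RDMA moves data from the storage
# server straight into GPU memory, with HTTP left as the control plane.
```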


What caught my attention during the Q&A were the practical implementation details. You need Mellanox ConnectX-5 or newer network cards, and there are real trade-offs around encryption (you basically lose the RDMA performance benefits if you need to decrypt on the client side). These are the kinds of infrastructure requirements organizations need to plan for now if they’re serious about supporting AI workloads. The performance gains are significant, but you’re looking at specific hardware requirements and architectural decisions that affect entire network fabrics.

4. From 30PB to 50PB Overnight: Scaling Storage for AI at Enterprise Scale

One of the most eye-opening parts of the presentation was hearing about real customer deployments – like the fintech client that scales from 30 petabytes to 50 petabytes based on market volatility, or the autonomous vehicle manufacturer storing over an exabyte of data. These aren’t theoretical use cases; these are production environments dealing with the kind of explosive data growth that keeps storage administrators up at night (and honestly, makes me grateful for our more modest data growth challenges).

What really resonated was the discussion around failure planning. MinIO built AIStor with an erasure coding parity level of eight; it assumes your hardware will break and plans accordingly. In environments where equipment often runs longer than ideal due to budget constraints (I once maintained a set of IBM servers nearly a decade past their initial warranty), this kind of resilience planning is crucial. When you’re talking about exabyte-scale deployments, hardware failure isn’t a possibility – it’s a constant reality.
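
To put “parity of eight” in concrete terms, here’s some back-of-the-envelope erasure coding math. The 16-drive stripe width is my assumed example, not a quoted MinIO configuration:

```python
# Back-of-the-envelope erasure coding math; the stripe width is my assumption,
# not a quoted MinIO configuration.
data_shards, parity_shards = 8, 8      # EC 8+8 across a 16-drive stripe
stripe = data_shards + parity_shards

tolerated_failures = parity_shards     # any 8 of the 16 drives can die
efficiency = data_shards / stripe      # usable fraction of raw capacity

raw_pb = 50
print(f"Tolerates {tolerated_failures} simultaneous drive failures per stripe")
print(f"Usable capacity: {raw_pb * efficiency:.0f} PB of {raw_pb} PB raw")
# -> Tolerates 8 simultaneous drive failures per stripe
# -> Usable capacity: 25 PB of 50 PB raw
```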

The implications for higher education are significant. Research institutions are increasingly dealing with AI and machine learning workloads that generate massive datasets. The traditional approach of scaling up conventional storage solutions isn’t going to cut it when a single research project can generate petabytes of data. Organizations need to start thinking about storage infrastructure that’s designed from the ground up for these workloads, not retrofitted to handle them.

Final Thoughts

What struck me most about MinIO’s presentation was AB Periasamy’s technical candor and depth of knowledge. This was my second experience at a Tech Field Day event where I found myself genuinely impressed by a CEO’s ability to dive into the technical weeds and provide substantive answers to challenging delegate questions. AB didn’t shy away from discussing the limitations and trade-offs of their approach – whether it was acknowledging the encryption challenges with GPU Direct Storage or explaining why certain hardware requirements are non-negotiable.

The removal of the Community Edition GUI, which initially brought MinIO to my attention for this event, makes more sense in the context of their broader strategy. They’re clearly betting that the future of storage isn’t about pretty management interfaces, but about APIs, automation, and intelligent data interaction. Whether that bet pays off remains to be seen, but their technical approach to solving real AI infrastructure challenges is compelling.

For organizations serious about AI workloads, MinIO’s AIStor represents a thoughtful approach to the storage infrastructure challenges that traditional vendors are still trying to solve by bolting AI capabilities onto legacy architectures. The question isn’t whether AI will transform how we think about storage – it’s whether we’ll build infrastructure designed for that transformation, or continue retrofitting solutions that were never meant for these workloads.


To watch all the videos of MinIO’s presentations at Cloud Field Day 23, head over to Tech Field Day’s site.


Scality at Cloud Field Day 23: When Petabytes Feel Predictable – Operational Lessons

Enterprise storage vendors love to talk about exabyte scale, AI readiness, and multi-cloud vision. But as someone who’s spent the better part of 30 years in operations—across SANs, NAS, cloud, and everything in between—I tend to filter that hype through a much simpler lens:

“Will this thing work when it matters?”

Scality presented at Cloud Field Day 23, and what stood out wasn’t just the scale of their deployments—it was the quiet operational sanity behind them. Sure, there was impressive performance and architectural flexibility, but what really caught my attention were five practical takeaways that don’t always make it into press releases or analyst write-ups.

Much of that clarity came from Scality CTO and co-founder Giorgio Regni, who didn’t just walk through architecture slides—he gave us a window into how RING behaves in production, with customers operating at truly massive scale.

Let’s dig into the details.

IT Admins Feel at Home—Because RING Administers Like AWS

I’ve seen my share of S3-compatible storage systems over the years. Most of them focus on API compatibility, but few actually try to feel like AWS when you’re managing them. Scality RING does.

When they say “you can administer RING like AWS,” they mean it. IAM, users, policies, roles—it’s all modeled on the AWS way of thinking. That means if your ops team knows how to manage buckets in AWS, they’re already 80% of the way toward managing a RING deployment. No translation layer required. No re-education. Just clean, intuitive administrative patterns that make sense at scale.

This isn’t an accident. Giorgio specifically emphasized how important this was to their design. Multi-tenancy, usage tracking, and S3-compatible policy enforcement were all baked into the system because, as he put it, “Our customers want to bill their customers—internal or external.”

That sounds like cloud, because it is.
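
As a concrete example of that “no translation layer” claim, here’s a sketch of applying a standard AWS-style bucket policy to an S3-compatible endpoint with boto3. The endpoint, account ID, and bucket name are placeholders of mine, not Scality’s:

```python
# Sketch: an AWS-style bucket policy applied to an S3-compatible endpoint.
# Endpoint, account ID, and bucket name are placeholders, not Scality's.
import json
import boto3

s3 = boto3.client("s3", endpoint_url="https://ring.example.com",
                  aws_access_key_id="ACCESS_KEY",
                  aws_secret_access_key="SECRET_KEY")

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::123456789012:user/analyst"},
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": ["arn:aws:s3:::research-data",
                     "arn:aws:s3:::research-data/*"],
    }],
}
s3.put_bucket_policy(Bucket="research-data", Policy=json.dumps(policy))
```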

CORE5: A Marketing Name Worth Keeping

Scality doesn’t just do object lock, encryption, and erasure coding—they’ve bundled these and other operational guardrails into a framework they call CORE5. I normally roll my eyes at product naming exercises, but this one stuck.

CORE5, as Scality frames it, covers five distinct layers of cyber resilience:

  1. API-level resilience – S3 Object Lock is enforced at the moment of object creation, ensuring data is immutable and protected from ransomware or accidental deletion.
  2. Data-level resilience – Fine-grained IAM controls, zero-trust architecture, and AES-256 encryption help prevent unauthorized access or exfiltration—even at scale.
  3. Storage-level resilience – Distributed erasure coding slices and scatters data across nodes, making it indecipherable to attackers—even if they gain root access.
  4. Geographic resilience – Multi-site replication ensures data survives regional disasters or breaches without sacrificing availability.
  5. Architectural resilience – The platform is fundamentally immutable; even with elevated privileges, it resists overwrites and tampering by design.

It’s a checklist that speaks directly to storage and security teams—people who live in the world of audits, recovery points, and “what if” drills.
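
The first layer is also the easiest to demonstrate, because S3 Object Lock is plain S3 API surface. This quick sketch (placeholder endpoint and bucket, and it assumes the bucket was created with Object Lock enabled) shows what enforcing immutability at object creation means in practice:

```python
# Sketch of layer 1 (API-level resilience): write an object under Object Lock
# so it can't be deleted or overwritten until the retention date passes.
# Endpoint and bucket are placeholders; the bucket must have been created
# with Object Lock enabled.
from datetime import datetime, timedelta, timezone
import boto3

s3 = boto3.client("s3", endpoint_url="https://ring.example.com",
                  aws_access_key_id="ACCESS_KEY",
                  aws_secret_access_key="SECRET_KEY")

s3.put_object(
    Bucket="backups",
    Key="db/2025-07-01.dump",
    Body=b"...backup bytes...",
    ObjectLockMode="COMPLIANCE",   # not even root can shorten this retention
    ObjectLockRetainUntilDate=datetime.now(timezone.utc) + timedelta(days=90),
)
```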

Giorgio didn’t spend long on branding, but the features behind CORE5 came up repeatedly in his examples—especially immutability, replication, and the ability to absorb drive failures without blinking. In his words: “At least 10 drives fail every day across our customer base. The system doesn’t care. It just heals.”

That’s the kind of design mindset that shows respect for ops teams.

75% Cost Reduction Isn’t Just About Hardware

Cost reduction stories are everywhere, but the 75% figure quoted in the European bank deployment wasn’t just about cheap disks or high density. It was the cumulative result of architectural and operational choices:

  • Immutable buckets replace complex backup regimes.
  • Lifecycle policies eliminate cold data hoarding.
  • Disaggregated scaling means you don’t have to oversize any one tier.

That’s the kind of systems thinking I appreciate—not “buy fewer drives,” but “manage data better across its lifespan.” And yes, that includes offloading to tape without your end users ever noticing.
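
The lifecycle piece, at least, is standard S3 machinery. Here’s a sketch with a placeholder endpoint; the GLACIER storage class is my stand-in for however a given RING deployment maps its tape tier through TLP:

```python
# Sketch: a standard S3 lifecycle rule that ages data out of the hot tier.
# Endpoint and target storage class are placeholders; how RING maps a class
# to tape via TLP is deployment-specific.
import boto3

s3 = boto3.client("s3", endpoint_url="https://ring.example.com",
                  aws_access_key_id="ACCESS_KEY",
                  aws_secret_access_key="SECRET_KEY")

s3.put_bucket_lifecycle_configuration(
    Bucket="satellite-imagery",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-old-scenes",
            "Status": "Enabled",
            "Filter": {"Prefix": "raw/"},
            "Transitions": [{"Days": 365, "StorageClass": "GLACIER"}],
        }],
    },
)
```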

Giorgio added useful context here too: their customers don’t just build for scale—they evolve into it. One bank started with just 1PB and now manages 100PB across six global regions. That’s not a forklift upgrade. That’s operational confidence.

Cold Data Still Matters—Just Don’t Make Ops Pay For It

One of my favorite things about the CNES space agency deployment was that it brought tape back without apology. Scality’s RING integrates with HSM partners (like HP, Atempo, IBM) via an open API called TLP, letting archived data move off to tape while leaving metadata stubs behind in RING.

The result: your apps still talk S3, but your infrastructure quietly shifts that 15-year-old satellite image from spinning disk to tape. The operations team isn’t stuck managing separate namespaces, writing custom scripts, or reverse-engineering archival policies. It’s transparent, policy-driven, and reversible if needed.

Giorgio’s team even replicated AWS’s Glacier-style retrieval behavior—right down to pending status and asynchronous notifications. But unlike Glacier, there’s no egress charge or hidden penalty. It’s your data, on your terms.
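
The retrieval flow is the familiar S3 one, too. Here’s roughly what that round trip looks like from a client’s perspective, with placeholder endpoint, bucket, and key:

```python
# Sketch: Glacier-style retrieval of a tape-archived object via plain S3 calls.
# Endpoint, bucket, and key are placeholders.
import boto3

s3 = boto3.client("s3", endpoint_url="https://ring.example.com",
                  aws_access_key_id="ACCESS_KEY",
                  aws_secret_access_key="SECRET_KEY")

# Kick off an asynchronous restore from the cold (tape) tier.
s3.restore_object(Bucket="satellite-imagery",
                  Key="raw/2010/scene-0042.tif",
                  RestoreRequest={"Days": 7})

# Poll for completion; while pending, Restore reads ongoing-request="true".
head = s3.head_object(Bucket="satellite-imagery", Key="raw/2010/scene-0042.tif")
print(head.get("Restore", "no restore in progress"))
```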

This isn’t just “S3-compatible.” It’s cold storage without cognitive overhead.

Reliability, Defined in Ops Terms: Five Years Between Failures Lasting Over Five Minutes

MTBF numbers are usually abstract, but Scality’s framing was refreshingly direct: across their customer base, the average time between service interruptions lasting more than five minutes is five years.

That’s not just marketing spin. That’s how you measure uptime when you’re responsible for real people using real applications in real-world systems. It’s not about failure-free hardware—it’s about failure-tolerant architecture. RING’s peer-to-peer foundation and automated healing clearly contribute, but more important is that the whole system seems designed to reduce drama, not just increase speed.

As Giorgio put it: “We get one event every five years. That’s the number I care about.”
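
As a back-of-the-envelope check (my arithmetic, assuming exactly one event per five years and that it lasts only the five-minute threshold):

```python
# Back-of-the-envelope availability implied by "one 5-minute event per 5 years".
# The event duration is my best-case assumption; real events could run longer.
minutes_per_year = 365.25 * 24 * 60
downtime_minutes = 5                      # once per five years
availability = 1 - downtime_minutes / (5 * minutes_per_year)
print(f"{availability:.6%}")              # -> 99.999810%, past five nines
```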

In an age where we’re constantly told to expect failure and design around it, Scality is saying, “Sure. And also—we’ll try not to wake you up for it.”

Closing Thoughts

It’s easy to get swept up in the big numbers: trillions of objects, exabytes of data, petabytes per day of ingest. But what I took away from Scality’s presentation—especially Giorgio’s portion—is this:

Operational success at scale isn’t just about performance. It’s about predictability.

And that’s something RING seems to deliver—not just to hyperscalers, but to the banks, governments, and researchers quietly building the infrastructure behind the infrastructure.

It might not be sexy. But it works. And that still matters.

To watch all the videos of Scality’s presentations at Cloud Field Day 23, head over to Tech Field Day’s site.