Cloud Storage Cost Optimization: 18 Expert Tactics (2026)

By: Oscar Moncada

Cloud operators, founders, and IT leaders share how they're managing cloud storage costs. From data-driven access audits to lifecycle automation to the forgotten capacity reservations hiding outside the standard cost report, here's what's working... and what's not.

Introduction: Why Cloud Storage Bills Keep Climbing

Cloud bills are climbing fast. Gartner projects $723 billion in worldwide public cloud spending, up from $596 billion just two years ago, and a reported 27% to 30% of that spend is wasted on resources nobody is actively using.

Storage is one of the worst offenders. Industry research suggests roughly 80% of corporate data has gone untouched for years, yet most of it still sits in the highest-cost performance tier by default. Companies that map their access patterns and re-tier accordingly typically uncover 20% to 40% in immediate storage savings. Aggressive cold-data offload programs can cut storage bills by up to 70%.

The math is simple; the execution is where teams trip up.

We asked operators, founders, and IT leaders who've actually done the work to walk us through their approach. The question: what's one way you've optimized storage costs in the cloud, and how did you identify which data could be moved to cheaper tiers?

The responses below cover AWS, Azure, and multi-cloud environments across genomics, medical imaging, voice AI, fleet management, insurance, and more. A few themes show up again and again: access patterns beat assumptions, automation is what makes savings stick, and the most expensive line items aren't always the ones you'd guess.

Here's what they shared.

The Forgotten QuickSight Reservation That Was Burning Budget

The storage cost win that surprised us most did not involve data tiering. It came from a capacity reservation we had forgotten about.

We use AWS QuickSight for data analysis, and when we first configured it, we provisioned SPICE capacity based on the volume of data we planned to analyze at the time, which was the right call. Over the following months, however, we did a significant optimization of how we stored and structured our underlying data, reducing the footprint considerably. Because we never went back to revisit the SPICE reservation, we were still paying for the original allocation, and nothing in the standard cost report flagged it as a problem. The fix was straightforward: we reduced the reservation to match actual usage, and the savings were immediate.

The broader lesson is that capacity reservations for managed services deserve regular review. Services like QuickSight SPICE and OpenSearch reserved nodes are easy to configure and then forget about. Building a recurring review of service-level capacity into your cloud operations cadence surfaces these gaps before they compound.
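A recurring review like this can be reduced to a small utilization check. The sketch below is illustrative only: the `Reservation` record, the 60% threshold, and the sample figures are assumptions rather than values from the QuickSight case, and in practice the provisioned and used numbers would come from each service's own console or API.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    """One capacity reservation on a managed service (hypothetical record)."""
    service: str
    provisioned_gb: float
    used_gb: float

    @property
    def utilization(self) -> float:
        # Fraction of the reserved capacity that is actually in use.
        return self.used_gb / self.provisioned_gb if self.provisioned_gb else 0.0


def flag_oversized(reservations, min_utilization=0.6):
    """Return reservations whose utilization is below the review threshold."""
    return [r for r in reservations if r.utilization < min_utilization]


# Made-up numbers for illustration only.
inventory = [
    Reservation("quicksight-spice", provisioned_gb=500, used_gb=120),
    Reservation("search-cluster", provisioned_gb=200, used_gb=170),
]

for r in flag_oversized(inventory):
    print(f"{r.service}: {r.utilization:.0%} used of {r.provisioned_gb:g} GB reserved")
```

Running a check like this on a schedule turns the review into a report someone receives, not a task someone has to remember.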

Oscar Moncada, Co-founder and CEO, Stratus10 & Kalos

Tagging GPU Training Data Hot, Warm, or Cold — And Letting It Age Itself

The first thing I did when storage costs started climbing at GpuPerHour was run an access frequency audit across every S3 bucket and persistent volume in our infrastructure. Most teams skip this step because they assume all stored data is actively needed, but the reality was eye-opening. Nearly 60 percent of our stored data, mostly old training checkpoints, completed job logs, and intermediate model artifacts, had not been touched in over 90 days.

I wrote a simple tagging script that tracked last-access timestamps and object sizes, then grouped everything into three categories: hot data accessed within 7 days, warm data accessed within 30 days, and cold data untouched for 90-plus days. That cold data category was the real opportunity. Training checkpoints from finished experiments, for example, were sitting in standard storage costing us four to five times more than necessary.
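The grouping logic can be sketched in a few lines of Python. This is not the script described above, only a minimal illustration under stated assumptions: last-access timestamps are already available (how they were collected is not described), the object names and sizes are hypothetical, and the 31 to 89 day band, which the three categories leave undefined, is treated as warm.

```python
from datetime import datetime, timedelta, timezone


def classify(last_access: datetime, now: datetime) -> str:
    """Bucket an object by days since last access."""
    age_days = (now - last_access).days
    if age_days <= 7:
        return "hot"
    if age_days <= 30:
        return "warm"
    if age_days >= 90:
        return "cold"
    # The 31-89 day band is not defined in the text; treating it as "warm"
    # is an assumption made only for this sketch.
    return "warm"


def summarize(objects, now):
    """Sum bytes per category for (key, size_bytes, last_access) tuples."""
    totals = {"hot": 0, "warm": 0, "cold": 0}
    for _key, size_bytes, last_access in objects:
        totals[classify(last_access, now)] += size_bytes
    return totals


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
sample = [  # hypothetical objects
    ("checkpoints/run-1.pt", 4_000, now - timedelta(days=200)),
    ("logs/job-42.log", 100, now - timedelta(days=12)),
    ("datasets/current.parquet", 900, now - timedelta(days=1)),
]
print(summarize(sample, now))
```

Summing bytes per category, not counting objects, shows directly how much storage spend sits in each band.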

We moved cold data into Glacier Instant Retrieval for anything we might need occasionally, and Glacier Deep Archive for completed project artifacts we are required to retain but rarely access. The warm tier went into Infrequent Access storage. The entire migration cut our monthly storage bill by about 40 percent without affecting day-to-day operations at all.

The key insight was that identification has to be automated and continuous, not a one-time cleanup. We now run the access-frequency tagging script weekly and have lifecycle policies that automatically transition objects based on access patterns. Data that starts hot naturally ages into cheaper tiers without anyone needing to make manual decisions about where it belongs.

Faiz Ahmed, Founder, GpuPerHour

Options Chains Go Cold the Moment Contracts Expire

Options chain data goes cold the second a contract expires — and we were paying hot-tier prices on terabytes of dead positions. The trigger was an AWS bill that grew 30% in three months while query patterns hadn't changed.

What we did: any options chain whose contracts have all expired more than 90 days prior gets moved to S3 Infrequent Access. Anything older than 18 months drops to Glacier Instant Retrieval. The 90-day buffer matters because earnings reactions and post-event vol behavior still get queried for several weeks after expiry.

We identified the candidates by joining query logs against the chain expiry table — 87% of reads hit data younger than 90 days, but storage costs were dominated by chains 1+ years out. That mismatch was the entire optimization.

Net result: the storage line item dropped meaningfully without a single query path changing. No code rewrite, just lifecycle rules.

Aigars Pilmanis, Founder, VolRadar

In Genomics, Rehydrating the Wrong File Wipes Out Your Savings

I've spent 15+ years in computational biology and HPC, and at Lifebit we design cloud-agnostic Trusted Research Environments and data lakehouses for genomics and clinical data — so storage tiering is something I think about as both a technical and operating-cost problem.

One practical way we've optimized cloud storage costs is by separating “hot” analysis-ready data from “cold” archival data inside the data lakehouse. The data that powers active workflows, cohort building, OMOP harmonization, or real-time analytics stays in fast-access storage; raw inputs, older intermediate files, completed workflow outputs, and reproducible archives get moved to cheaper tiers.

The key is identifying data by access pattern and reuse value, not just file size. We look at what is touched repeatedly by researchers or pipelines versus what is mainly kept for compliance, reproducibility, or future reference; FAIR metadata, cataloging, and standardized data products make that much easier because you can actually see what is still discoverable and reused.

A lesson I'd share: don't tier blindly. In genomics especially, rehydrating the wrong files can wipe out your savings, so we pair cost reviews with workflow evidence — what's used by Nextflow pipelines, what supports current studies, and what can safely sit in low-cost storage without slowing the science.

Maria Chatzou Dunford, CEO & Founder, Lifebit

Mapping Predictable Workloads to Azure Blob Tiers and Reserved Instances

One way we optimize costs is by migrating data to Azure blob storage and utilizing Reserved Instances for predictable workloads. We identify which data to move by conducting cloud assessments that flag redundant services and analyze storage misconfigurations.

During a migration for Aurex, we moved corporate data to SharePoint Online and utilized Azure blob storage accounts with cognitive search. This allowed the client to transition legacy data to more affordable tiers while maintaining the necessary accessibility for their operations.

We specifically use Microsoft Azure's native tools to monitor data usage and automate the transition to cooler storage tiers. This ensures businesses only pay for the specific resources they are actively utilizing.

Orrin Klopper, CEO, Netsurit

De-Duplicating DICOM Series and Tiering by the Clinical Window

One way I optimized cloud storage costs was to de-duplicate DICOM series and implement automatic tiering that keeps recent, frequently accessed studies on fast SSDs during the clinical window and moves older studies to encrypted cold storage. We identified data to move by tracking access patterns and last-access timestamps within that defined clinical window and by grouping files by DICOM UID to eliminate duplicates. Those policies run automatically so files age into cheaper tiers without manual intervention. This approach reduces the volume of hot cloud storage while keeping clinically relevant data immediately available and compliant in long-term storage.
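The grouping step might look like the sketch below. It assumes each file has already been parsed into a record with a `uid` field. Which DICOM UID is used (for example, Series Instance UID or SOP Instance UID) is not stated above, so the key name, paths, and UID values here are hypothetical.

```python
from collections import defaultdict


def find_duplicates(files):
    """Group file records by DICOM UID and return the redundant copies.

    Each record is a dict with hypothetical keys "path" and "uid".
    The first path seen for a UID is kept; later ones are reported.
    """
    by_uid = defaultdict(list)
    for record in files:
        by_uid[record["uid"]].append(record["path"])
    return {uid: paths[1:] for uid, paths in by_uid.items() if len(paths) > 1}


sample = [  # made-up records for illustration
    {"path": "hot/a.dcm", "uid": "1.2.3.1"},
    {"path": "archive/b.dcm", "uid": "1.2.3.1"},
    {"path": "hot/c.dcm", "uid": "1.2.3.2"},
]
print(find_duplicates(sample))
```

A real pipeline would likely also compare checksums before deleting anything, since a matching UID alone does not prove the bytes are identical.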

Andrei Blaj, Co-founder, Medicai

Start With the Orphans: Storage Nobody Owns but Everyone Pays For

One of the first things I do is look for storage that's no longer connected to anything but is still costing the client money every month. Most clients don't even know it's there. Getting rid of it is one of the easiest ways to lower a bill without changing how anything works.
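As one concrete illustration, here is how unattached block volumes could be picked out of an inventory. The sketch assumes AWS and records shaped like the `Volumes` entries returned by the EC2 DescribeVolumes API, where a `State` of `available` means the volume exists but is not attached. The provider and the sample data are assumptions, not details from the description above.

```python
def unattached_volumes(volumes):
    """Return volumes that are not attached to any instance.

    `volumes` is a list of dicts shaped like EC2 DescribeVolumes entries;
    State == "available" means the volume is billed but not attached.
    """
    return [
        v for v in volumes
        if v.get("State") == "available" and not v.get("Attachments")
    ]


# Hypothetical inventory; in practice this would come from the provider's API.
inventory = [
    {"VolumeId": "vol-aaa", "Size": 500, "State": "available", "Attachments": []},
    {"VolumeId": "vol-bbb", "Size": 100, "State": "in-use",
     "Attachments": [{"InstanceId": "i-123"}]},
]

orphans = unattached_volumes(inventory)
print([v["VolumeId"] for v in orphans], sum(v["Size"] for v in orphans), "GiB")
```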

I also check how long it's been since a file was last opened. Anything that hasn't been opened in months gets moved to a lower-cost storage plan. The client can still get to it if they need it, but it's no longer eating up budget for no reason.

Aaron Chichioco, IT Specialist, Partner Systems

Git-Like Branching to Separate Stable Data From Messy Legacy Batches

I implemented lakeFS on our S3 data lake to add Git-like branching and rollback to ML and production datasets. We identified legacy machine outputs — logs, sensor readings, and inconsistent batch metrics — as lower-value candidates because they were messy, siloed, and broke data lineage. Branching and testing those datasets without touching production baselines let us isolate which batches were stable and which were not. That isolation made it straightforward to flag the messy, infrequently accessed batches as candidates for cheaper storage tiers while keeping critical training and production data on the primary lake.

Matteo Valles, Owner, Vol Case

If a Human Hasn't Touched It in 90 Days, It Doesn't Belong in Hot Storage

We treat cloud storage like a basement. You just keep throwing boxes in there until you can't walk. We were paying top-tier prices for millions of old quote logs from 2022. Nobody was looking at them. It's dead weight. We didn't buy some fancy monitoring tool to find it. I just ran a simple query for anything not accessed in 90 days. If a human hasn't touched a file in three months, it doesn't belong in your “hot” bucket.

Move that junk to cold storage immediately. We shifted about 40% of our database into Glacier and our monthly bill dropped overnight. It's not about some complex data roadmap. It's about being a minimalist. And don't worry about “what if” scenarios. If you need that old data for an audit once every two years, you can wait four hours for the retrieval. Stop paying for instant access to things you've forgotten even exist.

James Shaffer, Managing Director, Insurance Panda

Context-Based Rules Beat Age-Based Rules for Regulated Data

I look at cloud storage through a GxP lens first: not all data deserves the same performance tier. One of the best optimizations is separating “active validation work” from “audit evidence and historical artifacts” instead of keeping everything in the same high-cost path.

We identified candidates for lower-cost storage by mapping data to business use, access frequency, and compliance need. In practice, things like current requirements, open test executions, and live traceability stay hot, while completed validation packages, older evidence attachments, superseded versions, and historical exports move to cheaper archival tiers once they're no longer part of daily execution.

The key is not just age-based rules, but context-based rules. If a record is closed, version-controlled, retained for audit, and rarely touched except during inspections or investigations, that's usually your signal it can move down-tier without hurting the user experience.
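A context-based rule of this kind can be expressed as a simple predicate. The sketch below is illustrative: the `Record` fields and the "at most two reads in 90 days" threshold for "rarely touched" are assumptions, not part of any specific validation platform.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical metadata for one validation artifact."""
    name: str
    closed: bool
    version_controlled: bool
    retained_for_audit: bool
    reads_last_90_days: int


def can_move_down_tier(rec: Record, max_reads: int = 2) -> bool:
    """True only when every context signal says the record is archival."""
    return (
        rec.closed
        and rec.version_controlled
        and rec.retained_for_audit
        and rec.reads_last_90_days <= max_reads
    )


records = [  # made-up examples
    Record("completed-validation-package", True, True, True, 0),
    Record("open-test-execution", False, True, False, 40),
]
print([r.name for r in records if can_move_down_tier(r)])
```

Note that age never appears in the predicate: an old but still-open record stays hot, which is the point of using context instead of a date cutoff.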

In validation platforms, teams often overspend because they treat screenshots, attachments, logs, and older document sets like transactional data forever. We've seen much better outcomes when you keep metadata, indexes, and audit readiness instantly accessible, but push the heavy binary evidence into lower-cost storage behind the scenes so users still experience a single system.

Stephen Ferrell, Chief Product Officer, Valkit.ai

Audit File Access Logs Before — Not After — the Migration

The biggest hidden cost I see consistently is companies paying for “always-on” access to data that nobody's touched in 18 months.

The fix that's made the biggest difference for us: before any migration, we audit file access logs to separate active working files from archival data. For one manufacturing client we moved to Azure, we found that a significant chunk of their stored data was old project documentation — needed for compliance, but rarely accessed. That stuff doesn't need to live on premium storage.

The practical move was pushing that compliance and archival data into cooler storage tiers within Azure while keeping their active operational files in SharePoint for daily collaboration. Same data, same accessibility when needed, meaningfully lower monthly bill — and their team didn't notice any difference in day-to-day workflow.

The diagnostic question I tell every business owner to ask their team: “What data do you access daily versus what do you need to keep but almost never open?” That one question usually surfaces exactly where you're overpaying.

Roland Parker, Founder & CEO, Impress Computers

Mapping Data to CMMC and HIPAA Mandates Before You Tier

I use Azure Blob Storage lifecycle management policies to automatically move data between Hot, Cool, and Archive tiers based on access frequency. This ensures we only pay for the performance level actually required by the business.

We identify candidates for cheaper tiers by mapping data to retention mandates for frameworks like CMMC 2.0 or HIPAA. For example, audit logs and inactive project documentation are moved to Archive storage since they are only needed for regulatory proof.
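For reference, a lifecycle management policy for this kind of rule is a JSON document. The sketch below builds one in Python. The rule name, the `logs/audit` prefix, and the 30 and 180 day thresholds are hypothetical, and it uses modification-based conditions for simplicity.

```python
import json


def lifecycle_rule(name, prefix, cool_after_days, archive_after_days):
    """Build one Azure Blob Storage lifecycle management rule as a dict."""
    return {
        "enabled": True,
        "name": name,
        "type": "Lifecycle",
        "definition": {
            "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
            "actions": {
                "baseBlob": {
                    "tierToCool": {
                        "daysAfterModificationGreaterThan": cool_after_days
                    },
                    "tierToArchive": {
                        "daysAfterModificationGreaterThan": archive_after_days
                    },
                }
            },
        },
    }


# Hypothetical rule: audit logs cool after 30 days, archive after 180.
policy = {"rules": [lifecycle_rule("audit-logs-to-archive", "logs/audit", 30, 180)]}
print(json.dumps(policy, indent=2))
```

Azure also documents a last-access condition, `daysAfterLastAccessTimeGreaterThan`, which is closer to tiering by access frequency but requires last access time tracking to be enabled on the storage account.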

This approach prevents “storage sprawl” and transforms your cloud footprint into a lean, audit-ready asset. It allows for high-availability where it counts while slashing costs on dormant data.

Michael Gaigelas II, President, Compliance Cybersecurity Solutions

Tiering Voice AI Call Recordings by Access Logs, Not Retention Rules

The biggest single cloud storage win we shipped was cutting our S3 bill by roughly 60% on call recordings and transcripts by tiering based on actual access patterns rather than a fixed retention rule.

Here's what we did. Every call our voice AI handles produces an audio recording, a structured transcript, and a JSON event log. In the first 30 days these get hit constantly: customers replay calls, we pull them for QA, automations reference them for follow-ups. After 30 days, access drops sharply but doesn't go to zero, mostly the customer pulling an old call for a dispute or compliance reason. After 180 days, access is rare enough that latency stops mattering.

To identify what could move, I turned on S3 Server Access Logging into a separate bucket, ran an Athena query against the last 90 days of GET requests grouped by object prefix and age, and got a clear histogram: roughly 92% of reads were against objects under 30 days old, less than 2% against objects over 180 days. That data made the policy obvious.
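The histogram itself is a small aggregation. The sketch below does it in plain Python, not Athena, to show the logic. It assumes the access logs have already been reduced to (object key, request time) pairs and that creation times are known. It groups by age only, not by prefix, and the sample data is made up.

```python
from collections import Counter
from datetime import datetime, timedelta

BANDS = ("<30d", "30-180d", ">180d")


def age_band(age_days: int) -> str:
    """Map object age at read time to the bands used in the analysis."""
    if age_days < 30:
        return "<30d"
    if age_days <= 180:
        return "30-180d"
    return ">180d"


def read_share_by_age(reads, created_at):
    """Share of GET requests per age band.

    reads: iterable of (object_key, request_time) pairs from access logs.
    created_at: dict mapping object_key to its creation time.
    """
    counts = Counter(age_band((when - created_at[key]).days) for key, when in reads)
    total = sum(counts.values()) or 1  # avoid dividing by zero on empty input
    return {band: counts[band] / total for band in BANDS}


t0 = datetime(2026, 1, 1)
created_at = {  # hypothetical objects
    "calls/new.wav": t0 - timedelta(days=5),
    "calls/mid.wav": t0 - timedelta(days=60),
    "calls/old.wav": t0 - timedelta(days=300),
}
reads = [("calls/new.wav", t0)] * 8 + [("calls/mid.wav", t0), ("calls/old.wav", t0)]
print(read_share_by_age(reads, created_at))
```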

The lifecycle rule we deployed: Standard for 30 days, Standard-IA for days 30 to 180, Glacier Instant Retrieval after 180 days, and a permanent delete at 7 years for compliance. Glacier Instant Retrieval was the key choice over plain Glacier because we still needed sub-second retrieval for the rare case a customer pulls a year-old call.
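Expressed as an S3 lifecycle configuration, a rule like this might look as follows. This is a sketch, not the deployed configuration: the rule ID, the `recordings/` prefix, and the bucket name are hypothetical, and seven years is approximated as 7 × 365 days.

```python
import json

SEVEN_YEARS_DAYS = 7 * 365  # approximation; ignores leap days

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "call-recordings-tiering",      # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "recordings/"},  # hypothetical prefix
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 180, "StorageClass": "GLACIER_IR"},
            ],
            "Expiration": {"Days": SEVEN_YEARS_DAYS},
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))

# Applying it would look roughly like this (requires boto3 and credentials):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket",  # hypothetical bucket name
#       LifecycleConfiguration=lifecycle_configuration,
#   )
```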

A few practical lessons. First, look at access logs, not assumptions; people guess wrong about what's actually cold. Second, watch out for small-object overhead in IA tiers (under 128KB you pay for 128KB), so we batched tiny JSON event logs into daily aggregates before transitioning. Third, tag everything by customer and feature, so when finance asks who's driving cost, you can answer in one query.
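The small-object point in the second lesson is easy to check with arithmetic. The sketch below applies the 128 KB minimum billable size mentioned above to a hypothetical day of event logs; the file count and sizes are made up for illustration.

```python
MIN_BILLABLE_BYTES = 128 * 1024  # minimum billable object size cited above


def billable_bytes(sizes):
    """Total billed bytes when each object is charged for at least 128 KB."""
    return sum(max(size, MIN_BILLABLE_BYTES) for size in sizes)


# Hypothetical day of event logs: 1,000 JSON files of 4 KB each.
small_files = [4 * 1024] * 1000
separate = billable_bytes(small_files)        # each file billed as 128 KB
batched = billable_bytes([sum(small_files)])  # one daily aggregate file

print(separate // 1024, "KB billed as separate objects")
print(batched // 1024, "KB billed as one aggregate")
```

In this made-up case the separate files are billed for 32 times the storage of the single aggregate, which is why batching before the transition matters.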

Peter Signore, CEO, Dynaris

95% of Old Call Recordings Were Never Re-Accessed — The Log Proved It

The cloud storage cost I cut most was on customer call recordings. We were keeping every recording in hot S3 storage indefinitely “in case support or sales needed to pull it.” The audit revealed that 95% of recordings older than 30 days had never been re-accessed.

The way I identified what to move was running a one-off query on access logs: for each object in the bucket, last accessed date and frequency over 90 days. Anything not touched in 30 days went to S3 Infrequent Access. Anything not touched in 90 days went to Glacier Deep Archive. Storage spend on that bucket dropped roughly 70% inside a month. The lesson worth sharing: don't tier by file type or by team's stated needs. Tier by actual access logs. The “we might need it” intuition almost always over-estimates how often historical data actually gets touched.

Natalia Lavrenenko, Marketing Manager, Smarfle CRM

Tag by Type, Age, Owner, and Need — Then Let Lifecycle Rules Do the Work

One practical way I've cut cloud storage costs is creating lifecycle policies for data that was being retained indefinitely in costly “hot” storage even though it wasn't being regularly accessed. Logs, backups, old reports, media files, and exported datasets tend to sit in S3 Standard, Azure Blob Hot, or similar tiers for months. To find candidates for cheaper storage, I analyzed access patterns with Amazon S3 Storage Lens, CloudWatch metrics, Azure Cost Management, and storage access logs, then moved the low-access datasets to less expensive tiers such as S3 Standard-IA, S3 Glacier, Azure Cool, or Azure Archive.

The important part was not making assumptions. Every dataset was tagged by type, age, owner, and business need, and then basic rules were applied. For instance, datasets that had not been accessed in the previous 30 to 60 days moved to S3 Standard-IA, while compliance-related backups older than 90 to 180 days moved to S3 Glacier. Legal, financial, and customer datasets got a human review before they moved to an archive tier. I have found that this tiering strategy can reduce overall storage costs by 20% to 40% without sacrificing performance, provided teams establish retrieval times, retention schedules, and restore testing before implementing the rules.

Pratik Singh Raguwanshi, Manager, Digital Experience, LiveHelpIndia

Trial-Restore Before You Commit — Cost Savings Can't Create Operational Risk

Methodically organizing data that is not frequently accessed, such as archives or rarely used items, creates a significant opportunity to reduce cloud spending. One approach is to replace assumptions about which items fall into these categories with a documented operational trail: access logs, object metadata captured before the move, and the lifecycle behavior of the items in question. Each move is then understood, evaluated, and initiated based on accessed/modified activity over the last 90 to 180 days. Common examples of such data include backups, exports, audits, and media.

The biggest challenge is separating cold data from business-critical data and from items that need to be immediately accessible. For example, an item a customer has just uploaded stays in standard storage, while an item tied to historical monthly reporting could move to an infrequent-access tier that takes longer to retrieve (e.g., 24 hours vs. 2 minutes). If you implement lifecycle rules, trial-restore items to measure how long retrieval actually takes, and tag each item by purpose, owner (the individual responsible), and retention. Cost reductions should not create operational risk. The anticipated savings from lower-priced tiers are only realized if you can retrieve the required data when you need it.
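A trial restore can be as simple as timing one retrieval against the agreed target. The sketch below shows the shape of such a check. `restore_fn` is a hypothetical stand-in for whatever call actually restores the data, and the one-second target is used only so the example runs quickly.

```python
import time


def trial_restore(restore_fn, max_seconds):
    """Run a restore callable once and compare its duration to a target.

    restore_fn is a hypothetical zero-argument function that blocks until
    the data is readable again; max_seconds is the retrieval time the
    business has agreed to tolerate.
    """
    start = time.monotonic()
    restore_fn()
    elapsed = time.monotonic() - start
    return {"elapsed_seconds": elapsed, "within_target": elapsed <= max_seconds}


# Stand-in for a real restore; sleeps briefly instead of calling a provider API.
def fake_restore():
    time.sleep(0.05)


result = trial_restore(fake_restore, max_seconds=1.0)
print(result)
```

Recording the measured time next to each item's owner and retention tags gives finance, legal, and operations a shared number to agree on before the rule goes live.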

Cameron Woodford, CEO and Founder, Appello Software

Not All Data Deserves the Same Tier — And Not All Decisions Are IT's Alone

One way to decrease cloud storage costs is to develop lifecycle rules that automatically move older data into cheaper tiers once access to it has dropped off. One of the most common mistakes companies make is treating all data the same way: customer data that is continuously accessed (like customer addresses), application log files from last week's transactions, and archived data generated more than two years ago do not require the same cost or retrieval speed.

To identify what data can be moved from one storage tier to another, access logs and file creation dates must first be evaluated. A simple procedure is to tag data by last access date, business value, and intended recovery method. For example, monthly log files that have not been accessed within the past 90 days would move to the infrequent access tier, and compliance-related archived logs generated more than 12 months ago could move to the cold/historical storage tier. The main objective of a lifecycle rule is to reduce costs by moving data; however, proper planning ensures that all affected parties (finance, legal, and operations) agree on what can be moved, which reduces the risk that a long retrieval time leaves the company without access to data after the rule is activated.

Neil Webzell, CEO, Trafalgar Wireless

Compare Cloud TCO Against On-Premise Before Defaulting to the Cloud

We optimize costs by evaluating the total cost of ownership between cloud backup and onsite (on-premise) solutions, particularly for high-bandwidth “Bare-Metal Recovery” images. For our medical clients, we've found that utilizing local redundant drives for immediate failover while pushing only essential databases to the cloud significantly reduces data transfer fees.

To identify what moves, we compare the necessity of real-time “Redundant Servers” against “Disperse Mirroring” in a multi-building campus environment. This ensures your storage tier aligns with your specific disaster recovery requirements rather than a generic, expensive cloud plan.

Ryan Miller, Managing Partner, Sundance Networks

Key Takeaways: What the Best Cloud Storage Optimizers Did Right

A few patterns stand out across these answers.

Nobody who's actually optimized storage at scale relied on intuition. Every credible answer started with the same step: pull access logs, look at the real data, and let usage tell you what's hot and what's cold. Teams that tiered on assumptions almost always over-estimated how much of their data was truly active.

The savings also come from automation, not one-time cleanups. Lifecycle policies, tagging at creation, and recurring reviews are what keep cost optimization compounding as new data flows in. Spring cleaning buys you a single billing cycle of relief, but lifecycle automation makes the savings permanent.

Storage tiering also isn't the whole story. Capacity reservations on managed services like QuickSight SPICE and OpenSearch reserved nodes quietly drift out of sync with actual usage. Orphaned volumes, forgotten snapshots, and unused commitments all hide outside the standard storage cost report. The teams capturing the biggest savings build recurring reviews into their cloud operations cadence, not just lifecycle rules into their buckets.

If you're starting from zero, the path is straightforward. Turn on access logging, query the last 90 days of reads, and you'll have your tiering policy in hand within an hour. From there, the work is automation, governance, and keeping that visibility intact as your data footprint grows.

Visibility, ultimately, is what most teams underinvest in. You can't tier what you can't see, and you can't sustain savings if the standard cost report is the only place you're looking. The teams capturing the biggest wins go deeper to look at access logs, usage patterns, tags at creation, and recurring reviews of the managed services that may drift out of sync with actual usage.

But visibility alone doesn't lower the bill. The execution work — running the access-pattern audit, designing the right lifecycle policy, validating retrieval times against your workflow, and building the governance cadence to keep it intact — is what actually moves a storage line item by 25 to 70 percent. That's the work most teams either skip or underestimate (and what we help with).

Stratus10 helps AWS customers find where their cloud spend is actually going, and Kalos automates the policies that keep waste from creeping back in. If you're looking at a cloud bill you can't explain, that's where to start.
