Snowflake Cost Saving Opportunities

Snowflake Cost Saving Opportunities

About

We often hear from customers that Snowflake is expensive. In our opinion, Snowflake is a powerful database with a very unique architecture compared to traditional databases. As such, it deserves a different approach to effective management.

For example, in a traditional database, a single bad query can bring the entire database to a crawl. In Snowflake, that will not happen due to its innovative architecture. However, while this is excellent in terms of database performance and workload management, you may receive a surprise invoice for that bad query since it keeps the warehouse running for a very long time. (Check out our AvaWatch real-time query monitor App on the Snowflake marketplace).

The intent of this app is to identify potential saving opportunities on a high level before investing your money and energy for cost optimization efforts. The Premium Full version of this app gives you much more detailed insights and actionable recommendation steps to realize the cost-saving opportunities.

Email us at contact@ava-watch.com Or Visit us at: Ava-Watch.com. We hope we can earn your trust and opportunity to be part of your Snowflake journey.

How to run the app.

When you run the app for the first time, it will ask your permisson for Snowflake account_usage view, as well as warehouse the app will run in. Depending on the size of your system, if you have a lot of query activities (more than 100 million queries a year, for example), the app may take more than just a few minutes to run. You may consider to use a Medium size or larger warehouse for quicker response time.

You can recalculate the opportunities periodically to keep things in check.

How to interprete opportunities and actions to take

You will see one or more messages below if an opportunity is found. If nothing is found, a message will display: "No significant saving opportunity found."

1. Credits used by IDLE workload

IDLE occurs when the Snowflake warehouse is up and running, but there is no active workload running in it. In a well-configured and active system, you should expect IDLE to be well below 5% of your total cost.
A few things you can do to reduce the IDLE :

  1. Adjust down warehouse autosuspend configration.
  2. Optimize and balance workload across warehouses.
  3. When necessary, suspend warehouse manually.

2. Credits used by queries

As a rule of thumb, often the majority of credits are consumed by just a handful of repeated queries. Focusing on tuning those handful queries can yield significant cost savings.
Important: Snowflake query cost is not just associated with the query itself, but also the environment in which the query runs. For example, if your query is the only one running in the warehouse sequentially, then query cost will be much higher compared to running queries in parallel. Keep it in mind as tuning query soley based on runtime may not yield as much cost saving as you were hoping for.

3. Credits used by data processing.

For many organizations, data processing is the most expensive operation in the data warehouse. Optimizing cost for those top tables can reduce your overall cost significantly.
Optimization effort should also include data processing schedule. If a table gets updated very frequently (real-time, hourly, multi-times a day, etc.), cost will go up. Make sure the table gets utilized/read in similar frequency to justify the increased cost.

4. Credits used by auto-clustering

Snowflake Cluster/Partition is considered by many as one single most important performance tuning technique. However, maintaining clustering table can be costly if not treated carefully. Sometimes, the cost for maintaining clustering is comparable or even greater than data processing cost.
Reasons for excessive auto-clustering consumption may include:

  1. Suboptimal clustering key.
  2. Suboptimal data processing.
  3. Excessive auto-clustering frequency. This is likely a result of suboptimal configuration described above. Some organizations simply turn off auto-clustering occasionally to reduce the auto-clustering frequency

5. Credits used by unused tables.

In today’s world, Data democratization enables everybody in an organization to work with data, even create their own tables/processes. Sometimes, those tables are no longer used but data processes still left running and consuming credits. Turn off those unused processes and clean up unused tables, it is low-hanging fruit in your cost optimization journey.
Note: Tables can be shared to other Snowflake accounts, Make sure it is NOT being used by data share before cleaning up.

6. Storage used by timetravel and fail-safe.

Snowflake utilizes versioning technique which enables data retrievals back in time. In certain situation, especially when a table gets updated very frequently, Snowflake will have to make too many table versions than necessary, which results in excessive storage usage, hundreds or even thousands times more than the actual data itself. In such situation, please turn off timetravel and fail-safe for those high churn tables (utilize transient table and DATA_RETENTION_TIME_IN_DAYS 0).

7. Storage used by CLONE.

Zero copy CLONE enables us to create table copies easily without any additional storage, initially. Overtime though, excessive storage for CLONE can be accumulated as original and/or cloned tables get updated. Cleaning up unused cloned tables can save BIG in storage cost.

8. Credits used by failed queries.

Query failure is inevitable, but it does cost too. Periodically review failed query history, especially those failures with high aggregated cost will not only save you money by preventing the same failure in future, but also reveal otherwise unnoticable failed processes in your Snowflake.

Contact us

If you run into problems or would like to review the full version, feel free to contact us at contact@ava-watch.com, subject line: Cost Opportunity support. Or Visit us at: Ava-Watch.com