Wix Engineering

Moving the Needle: 10 Lessons Learned from our FinOps Evolution

Updated: Jun 8, 2023


The pain point

As we navigate the winds of economic uncertainty, it's important to keep a watchful eye on our expenses, especially on cloud services, where invoices can quickly swell if left unmanaged.




For many organizations, cloud expense is a major (and impactful) cost. The ability for engineers to become the purchasing team created a unique challenge - governing the spending and efficiency of our workloads. With the increasing awareness of the critical need for efficient management of the production environment, more and more companies are taking proactive actions to reduce their cloud expenses, starting with implementing FinOps within the organization.


According to the FinOps Foundation’s “State of FinOps 2023”, three of the “Top Challenges” for practitioners are empowering engineers to take action, organizational adoption of FinOps, and leadership buy-in for FinOps. In this article, we present how we approached these challenges at Wix.



The Solution

The FinOps Foundation defines FinOps as “an evolving cloud financial management discipline and cultural practice that enables organizations to get maximum business value by helping engineering, finance, technology, and business teams to collaborate on data-driven spending decisions.”

Wix’s FinOps journey started in 2017 and has since evolved by adopting effective management strategies, focusing on the engineering side of FinOps, and taking proactive steps to reduce cloud expenses. Wix has demonstrated it's possible to reduce cloud expenses by more than 25%, even after applying commitments and reservations.


During our FinOps journey, we have identified several key elements that have facilitated a successful cultural change and allowed us to implement a framework that enables engineers to approach problem-solving in a much more cost-sensitive manner.


Here are 10 steps and lessons learned from our Wix FinOps journey:



1. Focus on waste metrics, not cost

Many “FinOps” tools focus heavily on cost dashboards. The problem with this is that they don’t provide actionable insights. For example, a $50k EC2-Other cost may or may not be acceptable, and it's unclear whether any action needs to be taken. Instead, focusing on waste metrics can provide more useful information.


For example, if I have a c5.9xlarge in N. Virginia ($1.53/hour) and I’m using only 30% of its CPU, I can create a dashboard showing a “waste” of $1.53*70% per hour. If I have an EBS volume at 20% capacity utilization, I can show a “waste” metric of $0.1/GB*80%. If I have a DynamoDB table provisioned for 100 IOPS and using only 20 IOPS, well - you get the idea. Sorting such a dashboard by waste, from top to bottom, shows us where to focus our attention and generates action items for the engineering teams in the next sprint.
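The waste arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not our actual dashboard: the `Resource` class and the function names are hypothetical, and only the prices and utilization figures come from the examples above. A real dashboard would also normalize every resource to the same billing period before ranking; the sketch ranks per-unit values only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    unit_cost: float    # price per billing unit (hour, GB-month, ...)
    utilization: float  # fraction of provisioned capacity in use, 0.0 to 1.0


def waste(resource: Resource) -> float:
    """Cost of the unused share of a resource, per billing unit."""
    return resource.unit_cost * (1.0 - resource.utilization)


def rank_by_waste(resources: list[Resource]) -> list[tuple[str, float]]:
    """Return (name, waste) pairs, largest waste first."""
    ranked = sorted(resources, key=waste, reverse=True)
    return [(r.name, round(waste(r), 4)) for r in ranked]


# Figures taken from the examples above; the names are illustrative.
resources = [
    Resource("EBS volume (per GB)", unit_cost=0.10, utilization=0.20),
    Resource("c5.9xlarge (per hour)", unit_cost=1.53, utilization=0.30),
]

for name, wasted in rank_by_waste(resources):
    print(f"{name}: ${wasted:.3f} wasted per unit")
```

The top row of the ranked list is the first candidate for an action item.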



2. Build trust, and maintain it - accuracy is key

Building trust with engineering teams is a gradual process that requires many small building blocks. However, trust can be easily broken by a single small misstep. Our goal is to make sure that the data, financial information, measurements, and models we’re building are as accurate as possible. We need to make sure we are using the correct cost metrics (amortized vs unblended in AWS, cost and credits in GCP, effective reservation allocation, etc.) and that our models are robust. The worst thing we can do is present a cost reduction number that changes the engineering team’s priorities, only to discover later that we used the on-demand cost while the workload was covered by reservations, so we see half the expected reduction. Be accurate, and keep the trust!
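To make the on-demand pitfall concrete, here is a small sketch of the same saving estimated at two different rates. The 50% reservation discount is a hypothetical figure chosen to match the "half the expected reduction" scenario, the function name is illustrative, and the realized saving in practice also depends on how the freed commitment is reallocated.

```python
HOURS_PER_MONTH = 730        # common approximation for a month of runtime
ON_DEMAND_RATE = 1.53        # $/hour, the c5.9xlarge figure from lesson 1
RESERVATION_DISCOUNT = 0.50  # hypothetical discount, an assumption


def monthly_saving(instances_removed: int, hourly_rate: float) -> float:
    """Estimated monthly saving from removing instances at a given rate."""
    return instances_removed * hourly_rate * HOURS_PER_MONTH


# Rate we effectively pay once the reservation discount is applied.
effective_rate = ON_DEMAND_RATE * (1.0 - RESERVATION_DISCOUNT)

promised = monthly_saving(10, ON_DEMAND_RATE)   # what we told the team
realistic = monthly_saving(10, effective_rate)  # what the bill will show

print(f"promised ${promised:,.2f}, realistic ${realistic:,.2f}")
```

Presenting the first number when the second is the one that shows up on the invoice is exactly the kind of misstep that breaks trust.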



3. Be Proactive

If you’re waiting to see how your environment is doing from the cost reports - you’re missing the entire point of FinOps.


Acting on costs is a reactive process that handles the incidents after they occur. Our goal is to proactively govern our cloud spend by aligning with our company’s roadmap, integrating with the engineering provisioning pipelines, monitoring business growth to anticipate when we might need to increase our footprint, identifying pitfalls and missing visibility in our environment, and implementing other proactive predictions and engineering processes to stay ahead of the curve. Every dollar not wasted on the cloud is a dollar saved!



 

Watch Dvir Mizrahi and Nathan Besh talk about Wix’s unique approach and the key drivers for FinOps adoption, as well as the need for capability in the teams and for building competent operational processes around FinOps:


 

4. Be that “go-to” professional to apply the financial KPI

Many companies manage their engineering pipelines differently. The best lesson we learned was that we have to be part of that pipeline. FinOps shouldn’t be an “after the fact” consultant; it should be part of the provisioning and planning process and accompany that journey closely to apply that “elusive” financial KPI.


Only that way can we ensure that people are mindful of that KPI and, slowly but surely, drive a cultural change that creates engineering advocates for cost efficiency.



5. Stop reserving waste

Most of the FinOps practitioners we talked with measure their success by the percentage of capacity covered by AWS/Azure reservations or GCP CUDs, saying “we are 100% covered and 100% utilized.” We keep reminding them that the “30% waste” on the cloud refers to utilization waste, regardless of whether financial plans cover our environment. Bragging about covering your waste with reservations is not always the right FinOps approach. We learned that optimizing the underlying hardware first brings much more value in the long run, as it reduces the commitments required. If you’re just starting, you can use reservations as a “quick win” to reduce costs, but don’t make a practice out of it.
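A short sketch shows why "100% covered and 100% utilized" can still hide waste. The commitment amount and both function names are hypothetical, and the model is deliberately simplified: it assumes capacity could be shrunk to exactly what is in use.

```python
def idle_commitment(committed_spend: float, resource_utilization: float) -> float:
    """Share of a fully used commitment that pays for idle capacity."""
    return committed_spend * (1.0 - resource_utilization)


def commitment_after_rightsizing(committed_spend: float,
                                 resource_utilization: float) -> float:
    """Commitment needed if capacity were first shrunk to what is in use.

    Simplifying assumption: capacity scales linearly with utilization.
    """
    return committed_spend * resource_utilization


monthly_commitment = 100_000.0  # hypothetical, fully covered and utilized
utilization = 0.70              # i.e. the "30% waste" mentioned above

print(f"reserved waste: ${idle_commitment(monthly_commitment, utilization):,.0f}")
print(f"commitment after right-sizing: "
      f"${commitment_after_rightsizing(monthly_commitment, utilization):,.0f}")
```

Under these assumptions, roughly $30,000 of the monthly commitment buys capacity nobody uses, even though every reservation report looks perfect.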



6. Apply your business knowledge

No. We’re not talking about tagging your environment - that is obvious. We have a Wix saying: “hardware provisioned that is not serving our business has no right to stay up.” The idea behind it is simple - we are running resources in the cloud to serve our business, not just for fun. If we can’t understand how a workload serves our business, it might be difficult to justify its ROI. Examples might be a database that is only written to and never read (or that had 0 active sessions in the last 30 days), an application that serves requests but never receives any, or any other use case in that area {apply your business here}. If you don’t know what a workload serves and how it impacts your business, you should flag it as waste until proven otherwise. If you can’t measure its value, is it really worth the money?
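The "waste until proven otherwise" rule can be expressed as a handful of checks. The sketch below is illustrative only: the `Workload` fields, the workload names, and the idea that these counters are already collected somewhere are all assumptions, and your own business rules belong in `waste_reasons`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Workload:
    name: str
    kind: str                      # "database" or "service" (hypothetical labels)
    business_owner: Optional[str]  # None when nobody can say what it serves
    reads_30d: int = 0
    writes_30d: int = 0
    active_sessions_30d: int = 0
    requests_30d: int = 0


def waste_reasons(w: Workload) -> List[str]:
    """Return the reasons a workload should be flagged; empty means keep."""
    reasons = []
    if w.business_owner is None:
        reasons.append("no known business purpose")
    if w.kind == "database":
        if w.writes_30d > 0 and w.reads_30d == 0:
            reasons.append("written to but never read")
        if w.active_sessions_30d == 0:
            reasons.append("0 active sessions in the last 30 days")
    if w.kind == "service" and w.requests_30d == 0:
        reasons.append("no requests received")
    return reasons


fleet = [
    Workload("orders-db", "database", "checkout", reads_30d=9000,
             writes_30d=4000, active_sessions_30d=120),
    Workload("audit-db", "database", "compliance", writes_30d=500,
             active_sessions_30d=3),
    Workload("legacy-api", "service", None),
]

for w in fleet:
    reasons = waste_reasons(w)
    if reasons:
        print(f"{w.name}: flag as waste until proven otherwise ({', '.join(reasons)})")
```

A flag is the start of a conversation with the owning team, not an automatic shutdown.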



7. Share success stories and celebrate wins, big or small

One of the biggest challenges in the FinOps world is how to create a cost-sensitive culture in the organization and change the mindset of our engineers. One of the ways we successfully build engagement is by celebrating wins, as well as team initiatives, creativity, and out-of-the-box thinking when it comes to optimizing workloads. As soon as we mention the success stories, people start thinking and looking for opportunities for projects that improve efficiency in the production environment. We, as FinOps Engineers, can relay to management all the initiatives that took place during the month and how they contributed to the business. Sharing the success stories and giving credit to the relevant engineers and teams builds trust and increases engagement. It’s a powerful way for an organization to inspire, motivate, and build credibility with stakeholders while driving growth and innovation.



8. Financial Incidents are production incidents

Mistakes are inevitable, but it's crucial to learn from them and improve internal processes within the organization. In FinOps, a financial incident refers to any unexpected expense that falls outside the defined budget and isn't a result of natural business growth. When we encounter a financial incident, we will follow the same process as we do for production incidents. We don’t need to reinvent the wheel - we simply fit into the existing process. We will open a financial incident in the same way as we open production incidents and work towards a quick resolution. Once resolved, we will conduct a post-mortem to identify how to avoid future occurrences and take actionable steps.
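The definition of a financial incident above can be sketched as a simple check. How "natural business growth" is measured is an assumption here: the sketch scales the budget by the growth of a business metric such as traffic, and the function name and figures are hypothetical.

```python
def is_financial_incident(actual_spend: float, budget: float,
                          business_growth: float) -> bool:
    """True when spend exceeds the budget after allowing for business growth.

    business_growth is a fraction, e.g. 0.05 for 5% growth in a business
    metric such as traffic. Scaling the budget by it is an assumption.
    """
    allowed = budget * (1.0 + business_growth)
    return actual_spend > allowed


# Hypothetical monthly figures for one service.
print(is_financial_incident(10_400, budget=10_000, business_growth=0.05))
print(is_financial_incident(14_000, budget=10_000, business_growth=0.05))
```

The first call stays within the growth-adjusted budget; the second would open an incident through the same process as any production incident.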


 

Listen to Dvir Mizrahi tell the story of FinOps - developing a smarter and better financial engineering culture:


 

9. Common Language

In the FinOps community, you often hear the phrase “Cost Aware” organization.

The problem is that language matters. If you can't derive action items for the organization from "cost" metrics (and you can't), you can't drive engagement. This, in turn, hurts the effort to drive cultural change. What is the point of being "aware" of costs when the follow-up questions and action items are what matter?


Instead of being a "Cost Aware" organization, advance the policy of being "Waste aware" - which generates actionable dashboards and action items for the engineering teams.

Instead of being a "Cost Aware" organization, be a "Cost Sensitive" organization, one that doesn't wait to become aware of costs after the fact but applies policy in the design phase to reach a technical solution that is cost-sensitive, not cost-driven.

Instead of being a "Cost Aware" organization, promote a "Business Aware" culture that applies business unit economics to our workloads to measure their efficiency. Not how much things cost, but how much value and impact they generate.
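As a small illustration of unit economics, the sketch below compares two months by cost per business unit rather than by total cost. The figures, the choice of "requests served" as the business unit, and the function name are all hypothetical.

```python
def unit_cost(total_cost: float, business_units: float) -> float:
    """Cost per business unit (e.g. per request served, per active site)."""
    if business_units <= 0:
        raise ValueError("business_units must be positive")
    return total_cost / business_units


# Hypothetical months: spend went up, but so did the work being served.
march = unit_cost(total_cost=50_000, business_units=10_000_000)
april = unit_cost(total_cost=60_000, business_units=15_000_000)

print(f"March: ${march:.4f} per request, April: ${april:.4f} per request")
```

A cost dashboard would show April as 20% more expensive, while the unit cost shows the workload became more efficient.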


Language matters. We changed the terminology from "Cost" to "Waste", from "Spend" to "Investment", and from "Underutilization" to "Efficiency score". Those changes mattered and helped us drive the cultural change in our FinOps journey.



10. Be curious, ask questions

When we created the content for the FinOps Free Certification Program in Israel, the first thing we identified as essential to being a great FinOps engineer was soft skills. The ability to drive agendas, engage with peers, and educate your organization mostly comes from having good communication skills, curiosity, and being a people person.

The ability not to take things for granted, ask the hard questions, and insist on getting a full response to why things are the way they are is a key factor in driving change. Most of the time, engineers can explain why they are doing something, or what the solution they implemented does. But most often they can’t really explain if it is actually the best way of doing something, or what the problem is that they are trying to solve.


Being curious by asking follow-up questions, engaging with different teams in fruitful conversations, and challenging the way engineers think about their solutions makes all the difference. This way we’re avoiding the “worst best deal”, solution-driven solutions, and wrong justifications.



Conclusion

Driving a cultural change in an organization is not an easy task, but it’s also not impossible to achieve. Building this practice from the ground up involves changing the engineering mindset, engaging in different agendas to increase the financial efficiency of workloads, speaking the same language as engineers to challenge how new workloads on the cloud are approached, showing them where to put effort based on accurate data, and integrating into engineering pipelines. This empowers the teams and generates success stories.


These success stories are transformed into a form of “currency” that the FinOps engineer can use to “purchase” leadership buy-in, ensuring that the organization’s management level is engaged in their activities. In this cyclical process, we create a feedback loop that facilitates the organizational adoption of FinOps, fostering a culture of ongoing financial optimization and collaboration.



 


This post was written by Dvir Mizrahi and Ziva Tubul


 
