Wix Continuous Security Posture Management - Pt. 1

Wix Engineering
Jun 1, 2022

Updated: Aug 11, 2022

How is Wix’s organizational structure unique?

Wix is organized around “companies”. Each company delivers a specific functional component of the Wix platform. Structuring Wix this way enables everyone to belong to a small, intimate team while still contributing to the larger organization.

This structure, combined with our “guilds” structure, enables us to keep developer velocity high and bureaucracy low, maximizing individual impact.

How does it affect the security efforts?

Each of these guilds and companies may, from time to time, have their own infrastructure needs, which don’t quite fit the existing centralized structure. Therefore, they’re entitled to create and manage their own cloud infrastructure.

While it seems like a great approach business-wise, just imagine the complexity of maintaining such an infrastructure - and securing it!

To make things even more complex: not only do our companies operate autonomously, but we also have hundreds of cloud accounts and networks, over a hundred infrastructure engineers and DevOps personnel, thousands of different services, and thousands of developers, all contributing to this massive infrastructure!

And guess what? Our security team is dramatically outnumbered…

That’s complex indeed... How do you go about securing it all then?

Good question.

This is where our team lead came up with an idea of simply “accepting” the decentralization of Wix and finding (or creating) a solution that can literally manage the entire infrastructure security operations - all at once and in collaboration with our infrastructure owners themselves.

So, without hesitation, we started designing our holistic security system, as if it were as easy as popping up an EC2 instance in the cloud.

We put all our needs in one place and fantasized about our ideal security solution:

It had to be as automated as possible - we could never really survive if we had to do everything manually.
It had to be as comprehensive as possible - Wix uses a lot (and I mean a lot!) of different technologies and we had to be able to integrate them all in the process.
It had to be built around prioritization - our main pain point as security advocates is that most tools rarely prioritize things properly.
It had to have strong querying and correlation capabilities - we need to be able to identify transverse gaps throughout the stack.
It had to be able to visualize things - complex data is always more readable when it is shown via charts, graphs and visualizations rather than simply as text.
It had to support “collaboration” and “actionizing” of things - the proposed solution needed to have the capability to notify of new findings, create matching tickets and even enforce changes.

Eventually, after working with multiple vendors and ending up with plenty of different security systems in use, we realized we could no longer see the full picture!

Each tool had its own set of alerts, no correlation with other tools’ findings, different criteria for severity, separate reporting and automation flows, and worst of all - we couldn’t simply “throw” it all at our infrastructure owners and engineers.

That’s when we realized that we had to have some “mothership” system which could then gather all of the data in a centralized place and put everything together - and that’s when our platform was born!

Let me stop for a minute here, because at this point people usually ask us: “Why not just ship it all to the SIEM and work from there?” This is important to understand - there’s a difference between incident management and security posture management.

We believe our SIEM needs to remain our incident response platform and not be our security posture management platform. From our perspective, it should only be aggregating and correlating logs which represent actual actions taken in our environment, as they occur.

That doesn’t mean we don’t enrich our SIEM data with valuable information from our asset management system. It means that the SIEM doesn’t manage the posture itself.

Back to our system - what does it do? In a nutshell: it takes raw data and alerts from our security systems and all of our different infrastructure components (yes, across multi-cloud vendors) and normalizes them into a single dataset.
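To make the normalization step concrete, here is a minimal sketch of mapping findings from two tools onto one schema. Everything in it is an assumption made for illustration: the `Finding` fields, the tool names `scanner_a` and `scanner_b`, their raw payload shapes, and the severity vocabularies are hypothetical and do not describe the actual platform.

```python
from dataclasses import dataclass


# Hypothetical common schema; field names are illustrative only.
@dataclass(frozen=True)
class Finding:
    source: str     # tool that produced the finding
    asset_id: str   # identifier of the affected resource
    category: str   # e.g. "configuration" or "vulnerability"
    title: str
    severity: int   # normalized scale: 1 (low) .. 4 (critical)


# Each tool has its own severity vocabulary; map them all to one scale.
SEVERITY_MAPS = {
    "scanner_a": {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4},
    "scanner_b": {"info": 1, "warning": 2, "error": 3, "fatal": 4},
}


def normalize_scanner_a(raw: dict) -> Finding:
    return Finding(
        source="scanner_a",
        asset_id=raw["resourceId"],
        category="configuration",
        title=raw["ruleName"],
        severity=SEVERITY_MAPS["scanner_a"][raw["level"]],
    )


def normalize_scanner_b(raw: dict) -> Finding:
    return Finding(
        source="scanner_b",
        asset_id=raw["target"]["id"],
        category="vulnerability",
        title=raw["advisory"],
        severity=SEVERITY_MAPS["scanner_b"][raw["sev"]],
    )


NORMALIZERS = {
    "scanner_a": normalize_scanner_a,
    "scanner_b": normalize_scanner_b,
}


def normalize(source: str, raw: dict) -> Finding:
    """Dispatch a raw payload to the normalizer registered for its source."""
    return NORMALIZERS[source](raw)


# Two raw payloads about the same asset, in two different shapes.
raw_a = {"resourceId": "vm-1", "ruleName": "Port 80 open to the world", "level": "HIGH"}
raw_b = {"target": {"id": "vm-1"}, "advisory": "ADV-1", "sev": "fatal"}
findings = [normalize("scanner_a", raw_a), normalize("scanner_b", raw_b)]
```

Once both findings share an `asset_id` and a single severity scale, they can be compared and correlated even though they came from different tools.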

Everything is then stored in a relational database, allowing us to correlate data from all of our systems and create actionable insights and alerts, connecting directly to our infrastructure owners. Each and every infrastructure owner receives their own tenant on the platform, where they are then able to review Compliance Issues, Configuration Issues, Vulnerability Issues, and more - everything prioritized based on risk and probability for their environment.
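As a rough illustration of why a relational store helps here, the sketch below joins findings to asset owners and orders each owner's findings by priority. The table layout, the owner names, the 1-4 risk scale, and the choice of `risk * probability` as the priority formula are all assumptions for this example, not the platform's actual schema or scoring.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE assets (asset_id TEXT PRIMARY KEY, owner TEXT NOT NULL);
    CREATE TABLE findings (
        asset_id    TEXT NOT NULL REFERENCES assets(asset_id),
        category    TEXT NOT NULL,
        title       TEXT NOT NULL,
        risk        INTEGER NOT NULL,  -- assumed scale: 1 (low) .. 4 (critical)
        probability REAL NOT NULL      -- assumed scale: 0.0 .. 1.0
    );
""")
conn.executemany("INSERT INTO assets VALUES (?, ?)", [
    ("vm-1", "team-payments"),
    ("bucket-1", "team-payments"),
    ("vm-2", "team-media"),
])
conn.executemany("INSERT INTO findings VALUES (?, ?, ?, ?, ?)", [
    ("vm-1", "vulnerability", "Outdated TLS library", 3, 0.5),
    ("bucket-1", "configuration", "Publicly readable bucket", 4, 0.9),
    ("vm-2", "compliance", "Security agent missing", 2, 0.4),
])


def prioritized_findings(owner: str) -> list[tuple]:
    """Return one owner's findings, highest risk x probability first."""
    return conn.execute(
        """
        SELECT f.title, f.category, f.risk * f.probability AS priority
        FROM findings AS f
        JOIN assets AS a ON a.asset_id = f.asset_id
        WHERE a.owner = ?
        ORDER BY priority DESC
        """,
        (owner,),
    ).fetchall()


for row in prioritized_findings("team-payments"):
    print(row)
```

Filtering on `owner` is what gives each infrastructure owner a tenant-like view: they see only their own assets, already sorted by what matters most.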

The system is integrated with Slack and Jira, so every new finding is recorded and sent directly to the owner.
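One possible shape for those notifications is sketched below. It only builds the request bodies and sends nothing. It assumes a Slack incoming webhook (which accepts a JSON object with a `text` field) and Jira's REST "create issue" request format; the finding fields, the `SEC` project key, and the `Task` issue type are hypothetical, and the post does not say which Slack or Jira APIs the platform actually uses.

```python
import json


def slack_message(finding: dict) -> dict:
    """Build the body for a Slack incoming webhook (a JSON object with "text")."""
    return {
        "text": (
            f"[{finding['severity']}] {finding['title']} "
            f"on {finding['asset_id']} (owner: {finding['owner']})"
        )
    }


def jira_issue(finding: dict, project_key: str) -> dict:
    """Build a body shaped like Jira's REST 'create issue' request."""
    return {
        "fields": {
            "project": {"key": project_key},
            "summary": f"{finding['title']} on {finding['asset_id']}",
            "description": finding["mitigation"],
            "issuetype": {"name": "Task"},
        }
    }


# Hypothetical finding record; field names are illustrative only.
finding = {
    "severity": "HIGH",
    "title": "Publicly readable bucket",
    "asset_id": "bucket-1",
    "owner": "team-payments",
    "mitigation": "Remove the public-read grant from the bucket policy.",
}

slack_body = slack_message(finding)
jira_body = jira_issue(finding, project_key="SEC")
print(json.dumps(slack_body, indent=2))
print(json.dumps(jira_body, indent=2))
```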

Let’s look at the system a bit closer:

Main Dashboard - Provides a quick overview of the security posture score of the entire environment, ranked by our sub domains.
Asset Management - Allows an owner to receive valuable data about their entities and resources within their cloud and non-cloud infrastructure: things like users, roles, groups, compute, datastores, load balancers, etc.
Configuration Management - Helps find all of those scary misconfigurations that can lead to complete exposure (like publicly accessible buckets and datastores, highly permissive roles/groups/users, etc.).
Vulnerability Management - By creating an inventory of all of our installed software, applications and libraries from all workloads, and comparing that with available databases of known vulnerabilities, we’re able to identify all of the pending vulnerabilities.
Security Compliance - Helps track the coverage of our security tools in the infrastructure itself.
Alert Center - The holy grail of our system! This is the main page where all of our alerts are aggregated, describing the finding, the risk, the projected spread, the suggested mitigation and the tracking status for each alert.
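The inventory comparison described under Vulnerability Management can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the `Advisory` record, the package names, the advisory IDs, and the "fixed in version X" model are hypothetical, and versions are assumed to be plain dotted numbers. Real vulnerability feeds use richer version ranges and ecosystem-specific comparison rules.

```python
from dataclasses import dataclass


# Hypothetical advisory record; real feeds carry far more detail.
@dataclass(frozen=True)
class Advisory:
    advisory_id: str
    package: str
    fixed_in: tuple[int, ...]  # first version that contains the fix


def parse_version(text: str) -> tuple[int, ...]:
    """Parse a dotted numeric version such as "1.2.10" (no pre-release tags)."""
    return tuple(int(part) for part in text.split("."))


def pending_vulnerabilities(inventory: dict, advisories: list) -> list:
    """Match installed packages against advisories.

    inventory maps workload name -> {package name: installed version}.
    A package is flagged when its version is lower than the fixed version.
    """
    by_package = {}
    for adv in advisories:
        by_package.setdefault(adv.package, []).append(adv)

    hits = []
    for workload, packages in inventory.items():
        for package, version in packages.items():
            for adv in by_package.get(package, []):
                if parse_version(version) < adv.fixed_in:
                    hits.append((workload, package, version, adv.advisory_id))
    return hits


inventory = {
    "web-1": {"libfoo": "1.2.9", "libbar": "2.0.0"},
    "worker-1": {"libfoo": "1.2.10"},
}
advisories = [
    Advisory("ADV-1", "libfoo", (1, 2, 10)),
    Advisory("ADV-2", "libbar", (1, 5, 0)),
]
pending = pending_vulnerabilities(inventory, advisories)
```

Comparing versions as integer tuples avoids the classic string-comparison bug where "1.2.9" sorts after "1.2.10".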

Using smart correlations between different types of alerts from different domains (like Configuration Management and Vulnerability Management), we are also able to identify extremely risky “attack paths” where both the risk and the probability are high.

For example: an EC2 instance which exposes HTTP to the world, uses IMDSv1, and has an attached role that is highly permissive.

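Given the circumstances, if an attacker could find an SSRF in the web application running on this machine, they could query the instance metadata service (which IMDSv1 serves without requiring a session token) and steal the role’s temporary credentials. Things could end up pretty bad! Good thing we saw it coming.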
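A cross-domain correlation like this one can be expressed as a rule over the set of finding types attached to each asset. The sketch below is only an illustration: the rule name, the finding type labels, and the flat `(asset_id, finding_type)` input format are assumptions, not the platform's actual rule engine.

```python
from collections import defaultdict

# Hypothetical rule: every listed finding type must be present on one asset.
ATTACK_PATH_RULES = {
    "ssrf-to-credential-theft": {
        "http_open_to_world",      # network exposure finding
        "imdsv1_enabled",          # configuration finding
        "highly_permissive_role",  # identity and access finding
    },
}


def find_attack_paths(findings):
    """Return (asset_id, rule_name) pairs where an asset satisfies a rule.

    findings is an iterable of (asset_id, finding_type) pairs that may
    come from different security domains and tools.
    """
    types_by_asset = defaultdict(set)
    for asset_id, finding_type in findings:
        types_by_asset[asset_id].add(finding_type)

    matches = []
    for asset_id, types in sorted(types_by_asset.items()):
        for rule_name, required in ATTACK_PATH_RULES.items():
            if required <= types:  # all required finding types are present
                matches.append((asset_id, rule_name))
    return matches


findings = [
    ("vm-1", "http_open_to_world"),
    ("vm-1", "imdsv1_enabled"),
    ("vm-1", "highly_permissive_role"),
    ("vm-2", "http_open_to_world"),
    ("vm-2", "imdsv1_enabled"),
]
paths = find_attack_paths(findings)
```

Here `vm-2` is not flagged because its role is not highly permissive. Each finding alone might rank as moderate, and it is the combination on a single asset that makes the path dangerous.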

To learn more about how we actually got to our amazing solution and the stages we had to go through, continue to Wix Continuous Security Posture Management - Part 2.

This post was written by Opher Hofshi, Security Architect at Wix.

