
The Great Rewrite - How Wix is Preparing to Rewrite 100s of Systems - Part 1

Updated: 2 hours ago


Whether breaking down a monolith or rewriting a legacy backend service, every company eventually faces the challenge of rewriting systems. At Wix, we found ourselves in the process of rewriting hundreds of services and needed to plan accordingly.


In this post, we share the guidelines we created to help over 550 backend developers carry out this process as successfully, quickly, and efficiently as possible.




Why do companies rewrite their systems?


From a product perspective, it might seem more beneficial for a company to focus on developing new features rather than spending developers' time on rewriting legacy systems, which adds no direct value for customers.


The question then arises: why do all companies undertake this task? Over time, maintenance costs of legacy systems rise and delivery times for new features become unacceptable. The reasons behind these issues boil down to developer velocity:

 

  1. Convoluted code that is hard to update.

  2. Use of legacy technologies.

  3. Remodeling of systems - As the understanding of the domain changes over time, the system’s representation often needs remodeling.

  4. Quality issues, which complicate the development of new features without breaking existing ones.

  5. Difficulty maintaining large monolithic systems. As your team grows, more and more people work on the same source code.



What does rewriting a service actually mean?


When thinking about rewriting a service, the obvious idea might be simply to rewrite the code. In reality, the service rewrite probably also involves changes in data schemas and new domain modeling. The process requires several steps including:


  • Rewriting the code, followed by testing, QA, etc.

  • Planning for backwards compatibility, ensuring that existing clients of the old system are not affected during the migration process. This is very important for Wix, as most of our APIs are open to external developers.

  • Copying data over from the old system to the new one. This includes:

    • Transforming data, as its structure would most likely have changed. 

    • Copying existing data while continuously duplicating new data in the background, assuming you can’t afford downtime.

  • Gradually redirecting traffic to the new system.
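The data and traffic steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Wix's actual tooling: the in-memory stores, the `full_name` to `first_name`/`last_name` schema change, and the function names `transform`, `write_old`, `backfill`, and `routes_to_new` are all hypothetical.

```python
import hashlib

# Hypothetical in-memory stand-ins for the old and new data stores.
old_store = {}
new_store = {}


def transform(old_entity):
    """Convert an old-schema entity to the (assumed) new schema."""
    first, _, last = old_entity["full_name"].partition(" ")
    return {"id": old_entity["id"], "first_name": first, "last_name": last}


def write_old(entity):
    """Write path of the old system; every new write is duplicated to the new store."""
    old_store[entity["id"]] = entity
    new_store[entity["id"]] = transform(entity)


def backfill():
    """Copy entities that existed before duplication was switched on.

    Entities already present in the new store are skipped, because the live
    duplication in write_old has already delivered a fresher copy of them.
    """
    copied = 0
    for entity_id, entity in list(old_store.items()):
        if entity_id not in new_store:
            new_store[entity_id] = transform(entity)
            copied += 1
    return copied


def routes_to_new(tenant_id, rollout_percent):
    """Place a tenant in one of 100 stable buckets.

    Raising rollout_percent only ever moves tenants from old to new, never back.
    """
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < rollout_percent


# Usage: one entity predates duplication, one is written afterwards.
old_store["1"] = {"id": "1", "full_name": "Ada Lovelace"}
write_old({"id": "2", "full_name": "Alan Turing"})
backfill()
```

A real migration would also need ordering or versioning between the backfill and live writes, which this sketch leaves out.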



Wix and rewriting services


Wix has around 3,000 services, owned by over 40 product groups.


Over the last 4 years, Wix’s backend infrastructure group has made significant investments in platform engineering. We’ve built a new development platform that drastically increases developer velocity and allows our developers to focus on business logic. (Learn more about it via this blog post under Wix engineering.) This platform allowed us to deliver unified features and non-functional requirements that are important to our users across all services. For example, it allows external developers (our customers) to add schematic data to any entity in Wix, and it provides out-of-the-box support for GDPR compliance and PII encryption.


Given the positive impact of this platform over the past years, with over 400 new microservices already written with it, Wix has decided to migrate most of its legacy services to this platform.


Specifically for Wix, transitioning to this new platform requires a full rewrite of the services, which includes:

  • Remodeling the domain model to align with open platform principles.

  • Breaking down monoliths into multiple services. 



Preparing for the mass rewrite


As the infrastructure matured, we realized that while writing new services is easy, porting existing services to the new platform is complex, especially given the specific requirements of Wix, such as maintaining an open platform. Development teams attempting this rewrite had to plan the process without any existing guidelines, which led to sub-optimal processes, repeated pitfalls, and difficulties in completing the process and taking down the old system.


As the backend infra group, one of our core principles is commitment to adoption. This means that when we develop a feature, we pledge that it will be adopted by X number of users by a specified deadline. To support this principle, we aim to make the infra easy to use and tailor it to provide the exact features that are needed.


Given these considerations in addition to the massive rewrite, we realized we should approach this migration the same way we approach infrastructure features. As a result, we decided to write up guidelines that will streamline the migration process, and also to build tools around these guidelines in order to reduce duplication of work across different companies within Wix.



General approach


Before we could write up the guidelines, we realized we first had to choose our approach to data synchronization, as it affects many other decisions. We considered two options:

  1. Bi-directional sync between old and new systems.

  2. One direction sync between old and new systems.



Bi-directional sync between old and new systems


The bi-directional sync between old and new systems functions as its name suggests. With this approach, any data written to the old system would be available under the new system, and any data written to the new system would be available under the old system. Below is a simple diagram that describes the concept:


[Diagram: bi-directional sync between the old and new systems]

The main advantage of this approach is that clients can switch between the legacy (V1) and new (V2) system at any time. Additionally, different clients can use different versions of the systems at the same time. Developers often turn to this approach by default.


However, there can be significant challenges associated with supporting the bi-directional sync, such as: 

  1. No single source of truth.

  2. Avoiding circular writes.

  3. Race conditions due to parallel writes to the same entity in both systems.

  4. Conflict resolution.
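To make the circular-write challenge concrete, here is a minimal sketch in which each system forwards its local writes to its peer and tags forwarded writes with their origin so they are not echoed back. The `System` class and its fields are hypothetical, and the sketch deliberately ignores race conditions and conflict resolution, which need separate handling.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class System:
    """Hypothetical store that forwards each local write to its peer."""
    name: str
    data: dict = field(default_factory=dict)
    peer: Optional["System"] = None
    forwarded: int = 0

    def write(self, key, value, origin=None):
        self.data[key] = value
        # A write that arrived via sync carries its origin. It must not be
        # echoed back, otherwise V1 -> V2 -> V1 -> ... would loop forever.
        if origin is None and self.peer is not None:
            self.forwarded += 1
            self.peer.write(key, value, origin=self.name)


v1, v2 = System("v1"), System("v2")
v1.peer, v2.peer = v2, v1

v1.write("site-1", "blue")   # a client still on V1
v2.write("site-2", "green")  # a client already on V2
# Both systems now hold both entries, and each write was forwarded exactly once.
```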


Additionally, while the ability to roll a client back from V2 to V1 might sound like a good thing, it gives a false sense of confidence. To sync data in both directions, one has to write conversion functions between the domain entities: V1->V2 and V2->V1. The V1->V2 direction gets thoroughly tested, and the client transition to V2 is usually done carefully and gradually. The V2->V1 conversion, on the other hand, is only used in emergency situations, making it difficult to test properly for all scenarios.
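A small example shows why the V2->V1 direction is risky. Assume, purely for illustration, that V2 adds a `currency` field that V1 has no place to store; the entity shapes and the function names `v1_to_v2`, `v2_to_v1`, and `survives_rollback` are hypothetical.

```python
def v1_to_v2(entity):
    """Forward conversion; assumes every V1 price is in USD."""
    return {
        "id": entity["id"],
        "amount_cents": entity["price_cents"],
        "currency": "USD",
    }


def v2_to_v1(entity):
    """Backward conversion; V1 cannot represent the currency, so it is dropped."""
    return {"id": entity["id"], "price_cents": entity["amount_cents"]}


def survives_rollback(v2_entity):
    """True if an entity written in V2 is unchanged after a V2 -> V1 -> V2 round trip."""
    return v1_to_v2(v2_to_v1(v2_entity)) == v2_entity


usd_price = {"id": "p1", "amount_cents": 500, "currency": "USD"}
eur_price = {"id": "p2", "amount_cents": 500, "currency": "EUR"}

print(survives_rollback(usd_price))  # True
print(survives_rollback(eur_price))  # False: the rollback silently turns EUR into USD
```

Data that originated in V1 round-trips cleanly here, so the loss only shows up for entities that use V2-only capabilities, which is the case an emergency rollback is least likely to have been tested against.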

After considering these complications, we decided to recommend against this approach.



One direction sync between old and new systems


The one direction sync approach only syncs the data from the old system to the new system.

[Diagram: one direction sync from the old system to the new system]

This approach is much simpler. However, it doesn’t allow for rolling back: once a client has been moved to the new system, its updated data exists only there. Additional safeguards are therefore required.
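The behavior described above can be sketched as follows. The `OneWayMigration` class, its per-tenant `cut_over` step, and the two write guards are assumptions made for illustration; they show one possible kind of safeguard and are not the ones from the Wix guidelines.

```python
class OneWayMigration:
    """Hypothetical per-tenant cutover on top of a V1 -> V2 one direction sync."""

    def __init__(self):
        self.v1 = {}
        self.v2 = {}
        self.migrated = set()

    def write_v1(self, tenant, key, value):
        # Guard: a migrated tenant must not write to V1, because V2 is now
        # its only source of truth and nothing is synced back.
        if tenant in self.migrated:
            raise RuntimeError(f"tenant {tenant} already moved to V2")
        self.v1[(tenant, key)] = value
        self.v2[(tenant, key)] = value  # one direction sync: V1 -> V2

    def write_v2(self, tenant, key, value):
        # Guard: before cutover, a direct V2 write would be invisible to V1.
        if tenant not in self.migrated:
            raise RuntimeError(f"tenant {tenant} is still served by V1")
        self.v2[(tenant, key)] = value  # never copied back to V1

    def cut_over(self, tenant):
        self.migrated.add(tenant)


m = OneWayMigration()
m.write_v1("tenant-a", "color", "blue")  # synced to V2
m.cut_over("tenant-a")
m.write_v2("tenant-a", "color", "red")   # V1 still holds the stale "blue"
```

After the cutover, V1 keeps the stale value, which is why returning the tenant to V1 would lose data.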


We recommend this approach, provided that the process includes sufficient safeguards.


Summary


In this article, we presented the reasons why companies rewrite their services and described the challenges, both in general and specifically within Wix. We then described what led us to write guidelines and how we chose the general approach to data synchronization between the systems.


In our next article (The Great Rewrite - How Wix is Preparing to Rewrite 100s of Systems - Part 2), we will dive deep into each step of the guidelines we created:


  1. Defining the scope

  2. System remodeling

  3. Code rewrite

  4. Ensuring backwards compatibility with the old API

  5. Rollout to new tenants

  6. Data migration

  7. Compare

  8. Rollout to existing tenants


 

This post was written by Roni Enzel Elman and Oded Apel


 
