From Tools to Teammates: Building, Evaluating & Orchestrating Autonomous AI Agents
Wix Engineering Meetup
Monday, June 15, 17:30 - 20:00
Wix Campus, TLV

About our meetup
Dive into the next chapter of AI-native software development. As AI agents evolve from tools into teammates, the challenge shifts from simply generating code to orchestrating, evaluating, and trusting autonomous systems that think and collaborate together.
In this meetup, we’ll explore how teams are building multi-agent systems - and the new engineering layers emerging around AI reliability, evaluation, and decision-making at scale.
Agenda
17:30 — 18:00
Gathering: Pizza & drinks
18:00 — 18:45
Dror Arazi
Lead AI Software Architect, Wix
Mission to Mars: When AI Agents Negotiate 🚀
Very soon, your job won’t be writing every line of code - it’ll be designing and guiding AI agents that think, challenge, and solve problems alongside you.
In Dror Arazi's simulated mission to Jupiter, the system should have failed - there wasn’t enough fuel. Instead of crashing, AI agents debated, rejected the plan, and reinvented it, negotiating a gravitational slingshot to trade time for fuel.
That was the breakthrough moment: this wasn’t code execution, but a room full of AI agents, acting as engineers, solving a problem together.
The real shift is from controlling every step to setting the mission, the rules, and trusting the team. The future of software isn’t workflows - it’s AI crews on a mission.
If you want to learn how to work with agents like teammates - how to set constraints, guide decisions, and build systems that go beyond execution - this session will give you a practical starting point.
18:45 — 19:00
Break
19:00 — 19:40
Or Goldreich
Full-Stack Developer, Wix
How Do You Know Your AI Agent Works? Building EvalForge 👷‍♀️
As AI shifts from a supporting tool to the core of our products, the key question changes - from "does it work?" to "how do we know it’s working well?"
In this talk, Or Goldreich will introduce EvalForge - a platform for systematically evaluating AI agents across test scenarios, capabilities (skills, MCPs, sub-agents, rules), and assertions ranging from LLM judges to tool invocation checks - turning agent behavior into something measurable, debuggable, and comparable.
Beyond the technical architecture, we’ll explore why AI evaluation is fundamentally different from traditional software testing, the hidden challenges teams encounter when agents operate in production, and what changed once we could actually measure reliability, decision-making, and failure patterns at scale.
19:45 — 20:30
Networking
Speakers

Dror Arazi
Lead AI Software Architect, Wix

Or Goldreich
Full-Stack Developer, Wix
RSVP
Yunitsman St 5, Tel Aviv-Yafo
For more detailed directions, check out this navigation guide.