Ad Server: System Design & Architecture
In a recent post I mentioned my next big project is to build an ad server engine. This will basically respond to ad requests and will record stats for served ads (such as impressions, clicks, custom interactions and whatever else you might want to track). This post is about the system’s design, how its different parts will interact and how it should behave in certain scenarios.
The goal is to have a system that’s scalable! Ideally, doubling the number of components (servers) in the system should double its performance as well. Redundancy on the other hand is not that important, at least not for this particular application. As we’ll see later on, the engine will be able to function correctly even if some of its components (temporarily) crash.
I’ve done quite a bit of research lately on building scalable and redundant systems using MySQL databases. Almost all articles on this topic mention replication. At some point I was tempted to use circular replication to get (unnecessary) data redundancy, but this would not have helped with scaling, which matters far more, especially for ad servers.
Let’s start by looking at what kind of data is used by a typical ad server. I could identify 3 major categories:
- Configuration Data
This contains information about current campaigns, targeting parameters, CPM, target impressions, etc. The engine will mostly use this data in a “read-only” mode. In my particular case, this data comes from a completely separate admin application. The frequency at which this data changes is relatively low, at least when compared to the other 2 categories below.
- Runtime Data
This contains information generated during the normal operation of the ad server engine. You will find here things like counters, session data, etc. The engine will both read & write this data in almost equal proportions.
- Reporting Data
This contains everything that’s being logged about served ads (clicks, interactions, timers, etc). This information is mainly written to the database and otherwise not required by the ad server engine. In my particular case, this information will be picked up by the completely separate admin application and used to generate reports and adjust the configuration data.
The configuration data will be quite small in size (at least when compared with the reporting data) and will most likely be cached for super-fast access. The runtime data will also be kept in memory (as much as possible) but will also be saved in the database for persistence. The reporting data will be by far the largest in size and we’ll need to save this to the database as soon as possible (we’ll use an intermediary cache to reduce the number of WRITE operations to the database).
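To make the intermediary cache idea a bit more concrete, here is a minimal Python sketch of a buffer that aggregates reporting events in memory and hands them to the database in batches. The class name `ReportingBuffer`, the `flush_fn` callback and the `max_pending` threshold are all made up for illustration; the real engine may well end up looking different.

```python
from collections import Counter


class ReportingBuffer:
    """Accumulates reporting counters in memory and flushes them in batches."""

    def __init__(self, flush_fn, max_pending=1000):
        self._flush_fn = flush_fn        # hypothetical callback doing the real database WRITE
        self._max_pending = max_pending  # flush once this many events are buffered
        self._counts = Counter()
        self._pending = 0

    def record(self, ad_id, event):
        """Record one event (e.g. "impression" or "click") for an ad."""
        self._counts[(ad_id, event)] += 1
        self._pending += 1
        if self._pending >= self._max_pending:
            self.flush()

    def flush(self):
        """Hand the aggregated counters to the database layer as one batch."""
        if not self._counts:
            return
        batch = dict(self._counts)
        self._counts.clear()
        self._pending = 0
        self._flush_fn(batch)
```

With `max_pending=3`, recording two impressions and one click for the same ad results in a single call to `flush_fn` carrying two aggregated counters instead of three separate writes. The trade-off is that anything still sitting in the buffer is lost if the process dies before the next flush.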
If I take into account one of the goals for this ad server engine (to be able to serve 1 billion ads per month) and assume that for each ad served an additional 10-15 database WRITE operations are required (I’ve exaggerated a bit here), it soon becomes obvious that scalability actually means in this case being able to cope with all those database WRITE operations.
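To put rough numbers on this, here is a quick back-of-the-envelope calculation in Python. It assumes a 30-day month and perfectly even traffic, which real traffic never is, so treat the result as an average and expect peaks to be higher.

```python
ADS_PER_MONTH = 1_000_000_000
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def average_writes_per_second(writes_per_ad):
    """Average database WRITE rate for a given number of writes per served ad."""
    return ADS_PER_MONTH * writes_per_ad / SECONDS_PER_MONTH


ads_per_second = ADS_PER_MONTH / SECONDS_PER_MONTH
low = average_writes_per_second(10)
high = average_writes_per_second(15)
print(f"{ads_per_second:.0f} ads/s, {low:.0f} to {high:.0f} writes/s")
# prints "386 ads/s, 3858 to 5787 writes/s"
```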
I’ve mentioned above that almost all articles I’ve read about scaling MySQL talk about replication. Well, this is only useful when scaling READ operations. Scaling WRITE operations will only work by data partitioning (or sharding). Quick example: if you have an application with 100 million users, you most likely don’t want to keep all of those users in the same DB table because it will become unmanageable. Instead, you’ll split those users into say 20 tables each storing 5 million users. How do you determine which user goes into which table? Simple, you make this decision based on the user ID (using modulo 20) and this will make the user ID your partitioning key.
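The user example above boils down to a couple of lines of Python. The `users_NN` table naming is just an assumption for illustration.

```python
NUM_SHARDS = 20


def shard_for_user(user_id, num_shards=NUM_SHARDS):
    """The user ID is the partitioning key: modulo picks the shard."""
    return user_id % num_shards


def table_for_user(user_id):
    """Map a user to its table, e.g. users_00 .. users_19 (hypothetical names)."""
    return f"users_{shard_for_user(user_id):02d}"


print(table_for_user(12345))  # prints "users_05"
```

One thing to keep in mind with plain modulo partitioning is that changing the number of shards later moves most keys to a different shard, so the shard count is worth choosing carefully up front.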
Since a picture is worth a thousand words, here’s what the ad server engine will look like. It’s important to realize that almost all the computers in the image below do not necessarily need to be physical boxes and the entire system will happily work on just one server.

Ad Server Engine Architecture
Most likely each web server and its corresponding database server (black box and gray box right above it) will actually be on the same physical server running an HTTP server and a MySQL database. The “…” means that we will be able to add more servers to increase performance (horizontal scaling).
The Load Balancer will dispatch the incoming requests evenly to all available servers. The only data shared by the servers is the configuration and runtime data. Everything else is kept locally.
If a server crashes (not likely, but still) its local data will become temporarily unavailable, but here’s the nice thing about it: it doesn’t matter (more on this below)! The data kept locally by each server is mostly reporting data, which is not time sensitive. The runtime information stored in RAM on that server will also be lost, but the RAM cache will actually be replicated on at least one other server, so unless both servers die at the same time there will be no loss of runtime data.
So, how does it all work? Below are a few scenarios and the corresponding system behavior.
- New ad request is received
The Load Balancer will dispatch the request to one of the available web servers. The selected web server and its corresponding database server will become the “origin” for that ad request. An ID will be generated for the request, composed of the server’s ID and a locally unique ID generated by the DB (an auto-increment field, for example). Besides being globally unique, the request ID therefore also identifies the origin server, which is useful when the Load Balancer dispatches subsequent requests to different servers. The origin server will process the request, update what needs to be updated and send back the result. Everybody’s happy!
- The user clicked on the previously served ad
Again, the Load Balancer will dispatch this request to one of the available servers. This time the request will contain the request ID, because it’s a reporting/tracking request for an already served ad. The selected web server will inspect the request ID, determine which server is the origin and use the appropriate database server to handle the request. So normally, all requests for already served ads will use the origin database, keeping everything consistent, although the request itself might be processed by any of the available servers. In this case, the selected server will contact the origin database (which is exactly the same as using its local database from a software perspective) and will process the request. Everybody’s happy!
Now suppose the origin server is actually offline because it crashed right after processing and serving the initial ad request. In this case, the server selected by the Load Balancer will simply write whatever needs to be written to its local database (remember that all configuration and runtime information is still available). The beauty is that the selected server doesn’t need to access or read any of the local data stored on the origin server (which is down in our hypothetical scenario). The selected server will simply use its own local database and the available runtime data to process and serve the request. When the origin server finally gets back online, it could update its database with all the information it missed, but this is neither required nor important; it is practically an implementation decision. Again, everybody’s happy (even if some servers crashed)!
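The two scenarios above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `"<server id>-<local id>"` string format, the `Server` class, and the in-memory `events` list standing in for each server’s local database are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count


def make_request_id(server_id, local_id):
    """Combine the origin server's ID with a locally unique ID."""
    return f"{server_id}-{local_id}"


def origin_of(request_id):
    """Extract the origin server's ID from a request ID."""
    server_id, _, _ = request_id.partition("-")
    return int(server_id)


@dataclass
class Server:
    server_id: int
    online: bool = True
    events: list = field(default_factory=list)             # stands in for the local database
    _ids: count = field(default_factory=lambda: count(1))  # stands in for an auto-increment field

    def serve_ad(self):
        """Handle a new ad request: this server becomes the origin."""
        request_id = make_request_id(self.server_id, next(self._ids))
        self.events.append((request_id, "impression"))
        return request_id


def record_event(request_id, event, selected, servers):
    """Write to the origin database if it is reachable, else to the selected server's own."""
    origin = servers.get(origin_of(request_id))
    target = origin if origin is not None and origin.online else selected
    target.events.append((request_id, event))
    return target.server_id
```

For example, if server 1 serves an ad and the click lands on server 2, `record_event` writes to server 1’s database while server 1 is online and falls back to server 2’s local database once server 1 is marked offline.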
This is it! If you’ve got questions or feedback, I’m looking forward to reading your comments. As I go along and implement this ad server engine, I’ll write more posts about the interesting parts.