Art of scalability

I am writing a short series of articles about scalability. The first one, ‘Scalability principles’, covers what scalability is and the principles behind it; the following posts move on to scalability patterns, anti-patterns, and guidelines. So let’s start.

Go to : Part 1- Scalability principles

Go to : Part 2 – Scalability guidelines part 1

Go to : Part 3 – Scalability guidelines part 2

Scalability guidelines for building scalable software systems – part 3:

Measure the right thing:

As discussed in the Performance anti-pattern post, you should measure the right things. For example, you can’t ask me ‘How many users do you have in your application?’ to measure the scalability of my application. You should ask me the right question, such as ‘How many concurrent users do you have in your application?’. Selecting and measuring the right thing is very important; for more information, please read the ‘Measuring and Comparing the Wrong Things’ section of the Performance anti-pattern post.
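To make the difference between the two questions concrete, here is a minimal sketch that computes both numbers from the same data. The session list and the function name `peak_concurrent_users` are assumptions made up for this illustration; they do not come from a real application.

```python
def peak_concurrent_users(sessions):
    """Return the largest number of sessions open at the same moment.

    `sessions` is a list of (start, end) timestamps; `end` is exclusive.
    """
    events = []
    for start, end in sessions:
        events.append((start, 1))   # a user arrives
        events.append((end, -1))    # a user leaves
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so a departure is counted before an arrival at the same instant.
    events.sort()

    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Hypothetical sessions, in minutes since the start of the day.
sessions = [(0, 10), (5, 15), (12, 20), (30, 40)]
total_users = len(sessions)             # the "wrong" question
peak = peak_concurrent_users(sessions)  # the "right" question
print(total_users, peak)
```

With this data the application has four users in total, but it never has to serve more than two of them at the same time.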

Understand the semantics of the lower-level libraries:

Abstraction is a very good concept, but understanding the semantics of the lower-level libraries is important when you design and develop a scalable software system. If you have many application servers with integration between them, you may ask yourself:

  • Does your RPC infrastructure create TCP connections to all or only some of your back-end servers?
  • What are the differences between the APIs that the lower-level libraries offer?
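Why the first question matters can be shown with simple arithmetic. The sketch below is only an illustration: the server counts are made up, and the two functions are hypothetical helpers, not part of any RPC library.

```python
def total_connections(app_servers, backends_per_app_server):
    """Open TCP connections when every app server connects to the
    given number of back-end servers."""
    return app_servers * backends_per_app_server


def connections_per_backend(app_servers, backends_per_app_server, backends):
    """Average number of connections each back-end server has to hold."""
    return total_connections(app_servers, backends_per_app_server) / backends


# Assumed sizes, for illustration only.
APP_SERVERS = 100
BACKENDS = 200

# Case 1: the RPC layer connects every app server to all back-ends.
full_mesh = total_connections(APP_SERVERS, BACKENDS)
# Case 2: the RPC layer connects every app server to a subset of 10.
subset = total_connections(APP_SERVERS, 10)

print(full_mesh, subset)
print(connections_per_backend(APP_SERVERS, 10, BACKENDS))
```

Under these assumptions the full mesh keeps 20,000 connections open, while the subset needs 1,000. You only know which case applies to you if you know what the library does below the abstraction.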

Fact – Any Scalable System is a Distributed system:

As I said before: partition your system into small pieces that can be optimized or scaled independently. This also means that these small pieces can be distributed across different physical servers, so you can create a load-balancing layer (a layer that delivers each request to the nearest healthy server). In a distributed system you need to take care of some points:

  • Fault tolerance:

Fault tolerance plays an important role in distributed systems. If you think that everything is reliable, you are wrong: you have to take care of fault tolerance and decide how to handle any error that occurs. For example, a sick server shouldn’t accept new requests, and an error while processing a request should be handled and isolated.
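The two rules in that example can be sketched in a few lines. This is a minimal sketch under my own assumptions: the `Backend` class, the `dispatch` function, and the threshold of three consecutive failures are hypothetical names and values chosen for the illustration, not a real load balancer API.

```python
class Backend:
    """A back-end server that counts as sick after too many failures in a row."""

    def __init__(self, name, max_failures=3):
        self.name = name
        self.max_failures = max_failures  # assumed threshold
        self.consecutive_failures = 0

    @property
    def healthy(self):
        return self.consecutive_failures < self.max_failures


def dispatch(backends, handler, request):
    """Send the request to the first healthy backend that can process it."""
    for backend in backends:
        if not backend.healthy:
            continue  # a sick server gets no new requests
        try:
            result = handler(backend, request)
        except Exception:
            # Isolate the error: record it and try the next server
            # instead of failing the whole request.
            backend.consecutive_failures += 1
            continue
        backend.consecutive_failures = 0
        return result
    raise RuntimeError("no healthy backend available")


def flaky_handler(backend, request):
    """Stand-in for real request processing; server 'a' is always down."""
    if backend.name == "a":
        raise ConnectionError("a is down")
    return f"{backend.name} handled {request}"


pool = [Backend("a"), Backend("b")]
results = [dispatch(pool, flaky_handler, f"req{i}") for i in range(4)]
print(results, pool[0].healthy)
```

After three failed attempts server `a` is marked as sick and the fourth request skips it completely. A real system would also probe the sick server from time to time so that it can return to the pool; the sketch leaves that out.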

  • Monitor the servers’ health and load:

Monitoring the servers’ health and load is fundamental in any load-balancing system, because it helps deliver each request to the nearest server that is not busy. In addition, archiving this data for offline processing and viewing lets you see when the busy times are, how healthy the servers are, and so on. This information helps you decide when to add more hardware or improve the software to scale further, and it can also tell you which piece of the system is the slowest.
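Both uses of the monitoring data, live routing and offline analysis, can be sketched as follows. The data layout and the names `pick_server` and `busiest_hour` are assumptions for this illustration, and the sketch ignores how near a server is and looks at health and load only.

```python
from collections import defaultdict


def pick_server(status):
    """Return the name of the least loaded healthy server.

    `status` maps a server name to {"healthy": bool, "load": float}.
    """
    healthy = {name: s for name, s in status.items() if s["healthy"]}
    if not healthy:
        raise RuntimeError("no healthy server")
    return min(healthy, key=lambda name: healthy[name]["load"])


def busiest_hour(samples):
    """Return the hour with the highest average load.

    `samples` is the archived data: a list of (hour, server, load) tuples.
    """
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [sum of load, count]
    for hour, _server, load in samples:
        totals[hour][0] += load
        totals[hour][1] += 1
    return max(totals, key=lambda hour: totals[hour][0] / totals[hour][1])


# Live view: s2 reports the lowest load but is sick, so it is skipped.
status = {
    "s1": {"healthy": True, "load": 0.7},
    "s2": {"healthy": False, "load": 0.1},
    "s3": {"healthy": True, "load": 0.4},
}
chosen = pick_server(status)

# Archived view: hypothetical samples collected during one day.
samples = [
    (9, "s1", 0.5), (9, "s3", 0.25),
    (14, "s1", 1.0), (14, "s3", 0.5),
    (22, "s1", 0.25),
]
peak_hour = busiest_hour(samples)
print(chosen, peak_hour)
```

Here the balancer picks `s3`, and the archive shows that 14:00 is the busy time, which is the kind of fact you need before deciding to add hardware.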

Different kinds of data need different kinds of partitioning:

When I spoke about partitioning before, I was speaking about partitioning code and subsystems, but data partitioning is also important. For example:
Suppose you have a photo search engine that provides thumbnails with the results. Partitioning the thumbnail data across many servers is very important. Why?
The problem with thumbnail images is that there are a lot of them and they are small (around 10KB), and every search request makes the browser send a lot of requests to display them. So partitioning the thumbnail images across many servers is important.
What I mean by this example is that partitioning the data is also important, maybe more than partitioning the software, and you should understand the semantics of the problem or challenge you face to be able to choose a good solution.
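One simple way to spread the thumbnails over many servers is to hash the file name and use the hash to pick a server. The sketch below assumes four hypothetical servers named `thumbs-1` to `thumbs-4`; it is one possible scheme and not necessarily the one a real photo search engine uses.

```python
import hashlib
from collections import Counter


def shard_for(key, servers):
    """Map a thumbnail key to one server, always the same one for the same key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer and wrap it
    # around the number of servers.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Hypothetical server names and thumbnail keys.
servers = ["thumbs-1", "thumbs-2", "thumbs-3", "thumbs-4"]
keys = [f"photo-{i}.jpg" for i in range(1000)]

placement = {key: shard_for(key, servers) for key in keys}
per_server = Counter(placement.values())
print(per_server)
```

The browser’s many small requests now land on different servers instead of one. Note that with plain modulo hashing most keys move to another server when the number of servers changes; consistent hashing is a common alternative when servers are added or removed often.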

My References:

  • Some videos about scalability from Google Video.
  • Building Scalable Web Sites – O’Reilly.
  • Wikipedia
  • InfoQ.com
  • My experience

5 Responses to “Art of scalability (4) – Scalability guidelines part 3”

  1. Mark Rose Says:

    This 4 part article is an excellent summary of the keys ideas behind building a scalable/distributed system. Thanks!

  2. Jorrit Schippers Says:

    … you should ask me the right question such as ‘How many concurrent user do you have in your Application?’. …

    That question isn’t right in all cases. Sometimes it is “How many actions do your users generate on your application?”

    For instance, for an instant messaging application, you can handle lots and lots of connections. As long as those users don’t generate any actions, such as messages and presence updates, you can support many concurrent connections using libevent.

  3. Maan Says:

    Just went through the 4 articles. It was very informative. Thanks.

  4. JaffD Says:

    Great article. Thanks for the time and effort you put into it.

  5. Scalability resources | Unix Stuff Says:

    [...] Guidelines for building scalable software system (part 2) – Scalability Guidelines for building scalable software system… – Scalability Worst Practices – how to minimize load time for [...]
