Please consider providing several scheduler models


tmart
06-19-2009, 06:43 AM
As I understand the behavior of the AppLogic scheduler, it seems to support something of a "backpack" dynamic programming model wherein the scheduler attempts to fill servers with components in order to leave sufficient resources for future large components. While this can be a good thing when we don't know what we will need to start up tomorrow, it creates resource contention and works against spreading I/O load across the grid. Usually we size grids based upon our knowledge of the largest applications.

As an example, if we have 4 physical servers and 10 virtual appliances total - but each with 2 volumes, the volumes (including mirrors) will be spread across the physical servers fairly evenly (I think), but all 10 virtual appliances will be started on the minimum number of physical servers (presumably to leave full server resources available for larger resource hogs in the future). If all virtual appliances can fit onto one physical server, then one physical server will be drawing from 20 volumes distributed across all 4 physical servers and the three "unused" servers are doing nothing except serving volume I/O.
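
The example above can be sketched in a few lines of Python. This is only an illustration of the two placement biases, not AppLogic's actual scheduler: the `place` function, the "pack" and "spread" policy names, and the single-number capacity model are all assumptions made for the sketch.

```python
from collections import Counter

def place(appliances, servers, capacity, policy):
    """Assign each appliance (name -> resource units) to a server.

    policy "pack":   first server with room (fills servers one by one).
    policy "spread": server with the most free capacity.
    Both policy names are hypothetical, for illustration only.
    """
    free = {s: capacity for s in servers}
    placement = {}
    for name, need in appliances.items():
        candidates = [s for s in servers if free[s] >= need]
        if not candidates:
            raise RuntimeError(f"no room for {name}")
        if policy == "pack":
            chosen = candidates[0]
        elif policy == "spread":
            chosen = max(candidates, key=lambda s: free[s])
        else:
            raise ValueError(f"unknown policy: {policy}")
        free[chosen] -= need
        placement[name] = chosen
    return placement

servers = ["srv1", "srv2", "srv3", "srv4"]
appliances = {f"app{i}": 1 for i in range(10)}  # 10 small appliances

# Count how many appliances land on each server under each policy.
packed = Counter(place(appliances, servers, capacity=10, policy="pack").values())
spread = Counter(place(appliances, servers, capacity=10, policy="spread").values())
print("pack:  ", dict(packed))
print("spread:", dict(spread))
```

With "pack", all 10 appliances end up on one server while the other three only serve volume I/O; with "spread", no server carries more than 3.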

There are at least a couple of problems that I see with this scheduler bias:


- High availability -- "failover groups" aside, we have a hugely asymmetric risk with respect to server loss.
- I/O contention -- setting inter-virtual appliance communication aside, the network contention for I/O seems too high, with all 20 volumes streaming into one physical server.


I might suggest building several models for the scheduler and allowing these to be user-selectable. It might make sense to allow these to be selectable at the application level - or at the grid level... but I could even see a use for physical servers to be built into scheduler pools. Here are some models that you might consider:


- One that provides the current "backpack" dynamic programming model.
- Another model that attempts to minimize risk due to single server loss.
- Perhaps another model that is biased toward keeping small communicating clusters (e.g. IN -> Apache) on the same server, so that their traffic does not need to cross the physical NIC.
- On grids running multiple applications, a single application will tend to be consolidated on the same physical servers because its components will (normally) have been started within a relatively small window of time. This hurts application start and stop times (everything starting on the same server), perhaps causing timeouts or ill effects on virtual appliances already running on the same physical server.


Perhaps these scheduler models could be somehow engaged by providing "hints" in a way similar to the current "failover groups" properties, but perhaps at the application level as well... and maybe also at the grid level, as described above.

Of course, we always need a means to relax any hard constraints induced by scheduler bias, similar to how we can now override failover groups resource constraints.
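
To make the "hints at several levels" idea concrete, here is a minimal Python sketch of how a component-level, application-level, and grid-level hint might be combined. The model names and the most-specific-wins precedence are assumptions for illustration; nothing here reflects an existing AppLogic property.

```python
from typing import Optional

# Hypothetical model names, roughly matching the list of models above.
MODELS = {"pack", "spread", "cluster"}

def effective_model(grid: str,
                    app: Optional[str] = None,
                    component: Optional[str] = None) -> str:
    """Return the most specific hint that is set.

    Assumed precedence: component, then application, then grid default.
    """
    for hint in (component, app, grid):
        if hint is not None:
            if hint not in MODELS:
                raise ValueError(f"unknown scheduler model: {hint}")
            return hint
    raise ValueError("a grid-level model is required")

print(effective_model("pack"))                                    # grid default only
print(effective_model("pack", app="spread"))                      # app overrides grid
print(effective_model("pack", app="spread", component="cluster")) # component overrides app
```

Leaving a hint unset at a lower level simply falls back to the next level up, which is one way to "relax" a bias without editing the grid default.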

PeterNic
06-19-2009, 09:29 PM
Tim,

Thanks -- we will consider those. The server groups may work well. Another capability that will make initial selection less important is live appliance migration -- so that appliances can be moved around to make room.

One issue to consider is consistent performance. If we schedule applications in a way that each appliance gets a different amount of resources (CPU, network and disk bandwidth) depending on whether there are other applications running, then we will start having side effects and interference between applications -- something that plagues other platforms to a degree where folks decide to forgo virtualization.

I do agree, however, that there are many cases when this may be acceptable or unimportant -- and getting the best you can get is good (as long as you test that the app still works ok with the minimum guaranteed resource assignment).

Regards,
-- Peter

tmart
08-10-2010, 04:12 PM
Another scheduling idea:

For server overrides: Allow a list of servers (e.g. "srv1,srv3,srv5")

Add a couple of new controls: "Start on same server as..." (a pulldown with a list of the components in the app). And also a "DO NOT start on same server as..."

This would allow something like:

IN_one --> WEB5_one {on one server}
=================
IN_two --> WEB5_two {on another server}

By setting WEB5_one.start_on_same_server_as = IN_one
and: WEB5_two.start_on_same_server_as = IN_two
and: IN_two.DO_NOT_start_on_same_server_as = IN_one

If these rules were evaluated collectively by your scheduler, then I think we could more accurately create HA configurations without having to pin to specific servers that might not be appropriate for a given grid without altering the configuration.
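
A rough Python sketch of what "evaluated collectively" could mean: try placements until one satisfies every same-server and not-same-server rule at once. The `schedule` function and its rule dictionaries are hypothetical, it ignores resource capacity entirely, and the brute-force search is only practical for a handful of components.

```python
from itertools import product

def schedule(components, servers, same_as, not_same_as):
    """Find one placement satisfying every rule at once (brute force).

    same_as:     {component: component it must share a server with}
    not_same_as: {component: component it must not share a server with}
    Returns {component: server}, or None if the rules cannot all be met.
    """
    for combo in product(servers, repeat=len(components)):
        placement = dict(zip(components, combo))
        ok_same = all(placement[a] == placement[b] for a, b in same_as.items())
        ok_apart = all(placement[a] != placement[b] for a, b in not_same_as.items())
        if ok_same and ok_apart:
            return placement
    return None

components = ["IN_one", "WEB5_one", "IN_two", "WEB5_two"]
servers = ["srv1", "srv2", "srv3"]  # no server is named in the rules themselves

placement = schedule(
    components,
    servers,
    same_as={"WEB5_one": "IN_one", "WEB5_two": "IN_two"},
    not_same_as={"IN_two": "IN_one"},
)
print(placement)
```

Because the rules never mention a server name, the same configuration would carry over unchanged to a grid with different servers.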

tmart
08-10-2010, 04:13 PM
I guess there were two ideas in there.

PeterNic
08-12-2010, 10:23 AM
Server overrides: yep, possible; but I think the co-location controls would be better.

Maybe we can look at these as:
- colocate group: list of appliances to try to run on the same server with (emphasis on try -- should this be a hard requirement or soft?)
- separate group: list of appliances to try to keep separate from
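
The hard-versus-soft question can be illustrated with a small Python sketch. The `pick_server` function and its arguments are hypothetical, and resource capacity is left out: a soft rule only ranks servers by how many group rules they would break, while a hard rule refuses to place the appliance if any rule would be broken.

```python
def pick_server(appliance, servers, placement, colocate, separate, hard=False):
    """Choose a server for `appliance` given where its group peers already run.

    placement: {appliance: server} for appliances already started.
    colocate / separate: sets of peer appliance names for this appliance.
    hard=True  -> return None if every server breaks at least one rule.
    hard=False -> return the server that breaks the fewest rules.
    """
    def violations(server):
        # Colocate peers that are running somewhere else.
        missed = sum(1 for p in colocate if p in placement and placement[p] != server)
        # Separate peers that are running on this very server.
        clashes = sum(1 for p in separate if placement.get(p) == server)
        return missed + clashes

    ranked = sorted(servers, key=violations)  # stable sort: ties keep input order
    best = ranked[0]
    if hard and violations(best) > 0:
        return None
    return best

running = {"IN_one": "srv1"}
# Only one server is available, so "separate from IN_one" cannot be honored.
print(pick_server("IN_two", ["srv1"], running, set(), {"IN_one"}))             # soft
print(pick_server("IN_two", ["srv1"], running, set(), {"IN_one"}, hard=True))  # hard
```

In the soft case the appliance still starts, just not where the rule wanted it; in the hard case the start is refused and the operator has to decide.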

Ideally, we should use assemblies to create groupings -- so you would be able to say, colocate all appliances from this assembly on the same server; and just add the two assembly instances in the current failover groups in order to separate the groups of appliances. This will likely keep the process intuitive and reduce the likelihood of errors (e.g., forgetting to add something to a colocate/separate group). Does this work?

Best regards,
- Peter

tmart
08-16-2010, 08:25 AM
Peter, here's some feedback.

The general case should not limit the "colocate group" to just a single server. There are likely two reasons for using "colocate group":


- HA
- Reduced network usage by keeping chatty pairs on the same physical machine


The HA scenario is the one that I'm trying to address. In this scenario, the bottom line is knowing that you have two stacks that are sufficiently separated from one another so that you don't have single points of failure built into your processing chains. One physical machine failing should not cause an entire app to fail that was designed from a logical perspective to provide two redundant paths. As such, we might need to say "start set A of components on set X of servers" and (within the same app) say "start set B of components on set Y of servers" and we can then make sure that sets X and Y are disjoint. But the pool of resources could possibly be bigger than a single machine. Allowing the server override to be a list of servers rather than a single server goes a long way to making this possible.
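
The disjointness check on sets X and Y is easy to sketch in Python. The pool layout and function names below are assumptions for illustration; the point is only that, once each stack carries a list of allowed servers, it is cheap to verify that no single server failure can touch more than one stack.

```python
from itertools import combinations

def overlapping_pools(pools):
    """Return (stack_a, stack_b, shared_servers) for every pair of pools that overlap.

    pools: {stack name: collection of servers that stack may run on}
    An empty result means the pools are pairwise disjoint.
    """
    conflicts = []
    for (a, servers_a), (b, servers_b) in combinations(pools.items(), 2):
        shared = set(servers_a) & set(servers_b)
        if shared:
            conflicts.append((a, b, shared))
    return conflicts

def stacks_hit_by(server, pools):
    """Return the names of the stacks that could be affected if `server` fails."""
    return {name for name, servers in pools.items() if server in servers}

good = {"A": ["srv1", "srv2"], "B": ["srv3", "srv4"]}  # X and Y disjoint
bad = {"A": ["srv1", "srv2"], "B": ["srv2", "srv3"]}   # srv2 is shared

print(overlapping_pools(good))
print(overlapping_pools(bad))
print(stacks_hit_by("srv2", bad))
```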

The "separate group" and "colocate group" should probably work with each other. We spoke about doing this a while ago and I think I called this the "inclusion group" (given that I interpreted the current HA failover group as an "exclusion group"). Now I'm not sure that this makes a lot of sense given the obvious interdependency between the two group concepts.

Using assemblies could work for this; however, I think that it would limit the use of assemblies. For example, in a complicated application architecture I imagine that there are several reasons why people would use assemblies today:


- Simply because the GUI is too cluttered to fit everything at one level
- To create a very wide single layer (exactly like WEB5x4) of a multi-layered app
- To encapsulate/bury detailed functionality of a complex app
- To provide a good hierarchical tool for property definitions


But if you implement this HA functionality only via assemblies, it will probably limit the use of assemblies to that special purpose. The other side effects of using assemblies, such as changing the component path from the controller, might not seem intuitive as a result of simply wanting to specify which components to run separately.

-- Tim