Pool-wide policy enforcement
Overview
Condor's most powerful feature is the flexibility of its resource allocation policies. While it defaults to a relatively simple policy, careful policy design can produce sophisticated policies fitted exactly to a site's specific needs. Here are some examples of very complex policies for the Bologna batch system.
One of the problems with policies is how to actually configure them. Policy settings in Condor are distributed across all of the pool's resources. Some policies require setting configuration parameters on the submission machines, the execution machines, and the matchmaker. Moreover, configuring the policy of an arbitrary host in the pool is possible only for someone who has permission to make remote configuration changes on that host, or who has direct access to its configuration file. Finally, most policies cannot be enforced on machines that join the pool in the future unless their owners configure them accordingly.
This project provides a centralized mechanism for pool-wide policy enforcement. It allows resource allocation policies to be specified in a single place, namely the matchmaker. These policies are enforced by the matchmaker during the matchmaking process, and therefore cannot be circumvented by either the resource owners or the pool users.
There are several types of policies:
- Preventing submission machines or resources from joining the pool, based on their properties. For instance, one can keep all machines with less than 128 MB of RAM, or with too slow a CPU, out of the pool. This is handy if most jobs are known to have high memory and CPU requirements but users do not bother to state them in the job requirements. It can also serve as a sanity check that all pool resources report their characteristics correctly.
- Preventing jobs from participating in matchmaking, based on their description. As in the previous case, jobs can be kept out of the matching process to avoid intentional or unintentional pool abuse. One example of such a policy is to stop the jobs of a particular user from being restarted more than 10 times. Such a restriction can make job failures much easier to debug.
- Preventing specific matches, based on the characteristics of both resources and jobs. This capability restricts which jobs can run on which resources by blocking matches between jobs and resources with specified parameters. It can be used to let some group of users use only low quality machines, or to let those users run only certain programs on certain resources.
Of course, this centralized pool-wide policy mechanism does not replace the existing distributed policy mechanism, but it can be used to override or enhance it. Coupled with logging, it can also serve as an auditing tool that records specified matching events.
Solution design
We implemented the changes in the Negotiator and use the ClassAd mechanism to specify policies. Every classad entering the Negotiator (either from the Collector, or directly from the Schedd during negotiation) is checked against the policy expression for its type. There are three such expressions: one for Schedd classads (NEGOTIATOR_PE_SCHEDD_ONLY_EXPR), one for Startd classads (NEGOTIATOR_PE_STARTD_ONLY_EXPR), and one for Job classads (NEGOTIATOR_PE_JOB_ONLY_EXPR). If the expression evaluates to true in the context of a given classad, that classad is discarded, i.e. it will not participate in matchmaking. For a resource, this prevents jobs from executing on it; for a job, it prevents any resource from being allocated to it.
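The discard step described above can be sketched in Python. This is a minimal model, not the Negotiator's actual code: classads are plain dictionaries, each policy expression is a Python function standing in for a ClassAd expression, and the attribute names (`Memory`, `NumRestarts`), the thresholds, and the treatment of a missing attribute as "not true" are assumptions chosen to mirror the examples in the overview.

```python
# Illustrative model only: dicts stand in for classads and Python
# callables stand in for ClassAd policy expressions.

# Policy expressions keyed by the configuration parameter that would hold
# them. Attribute names and thresholds are assumptions for illustration;
# a missing attribute is treated as "not true" (also an assumption).
POLICY = {
    "NEGOTIATOR_PE_SCHEDD_ONLY_EXPR": lambda ad: False,  # reject no schedd
    "NEGOTIATOR_PE_STARTD_ONLY_EXPR": lambda ad: "Memory" in ad and ad["Memory"] < 128,
    "NEGOTIATOR_PE_JOB_ONLY_EXPR": lambda ad: "NumRestarts" in ad and ad["NumRestarts"] > 10,
}

# Which parameter applies to which kind of classad.
PARAM_FOR_TYPE = {
    "Schedd": "NEGOTIATOR_PE_SCHEDD_ONLY_EXPR",
    "Startd": "NEGOTIATOR_PE_STARTD_ONLY_EXPR",
    "Job": "NEGOTIATOR_PE_JOB_ONLY_EXPR",
}


def admit(ad_type, ad, policy=POLICY):
    """Return True if the classad may take part in matchmaking."""
    expr = policy.get(PARAM_FOR_TYPE[ad_type])
    if expr is None:
        return True  # no policy configured for this type
    # The ad is discarded only when the expression evaluates to true.
    return expr(ad) is not True


def filter_ads(ad_type, ads, policy=POLICY):
    """Keep only the classads that the policy does not discard."""
    return [ad for ad in ads if admit(ad_type, ad, policy)]


machines = [{"Name": "a", "Memory": 64}, {"Name": "b", "Memory": 512}]
jobs = [{"Owner": "alice", "NumRestarts": 12}, {"Owner": "bob", "NumRestarts": 0}]

print([m["Name"] for m in filter_ads("Startd", machines)])  # ['b']
print([j["Owner"] for j in filter_ads("Job", jobs)])         # ['bob']
```

Note that these three expressions describe what to reject: a classad is dropped when its expression is true, so an expression that is always false leaves that type unrestricted.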
To influence the matchmaking process and prevent specific matches, the Requirements expression in every job and resource classad is ANDed with the policy expression for its type (NEGOTIATOR_PE_RESOURCE_EXPR for a resource classad, NEGOTIATOR_PE_JOB_EXPR for a job classad). This makes it possible to force the Requirements expression to evaluate to False during matchmaking, effectively preventing the match from occurring.
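The effect of ANDing a policy expression into Requirements can be illustrated with a small Python model. It is a sketch under stated assumptions rather than the Negotiator's implementation: a Requirements expression is modeled as a function of the classad itself (`my`) and the candidate match (`target`), and the user group, the `KFlops` threshold, and all helper names are hypothetical. The policy shown is the one from the overview, where a group of users may use only low quality machines.

```python
# Illustrative model only: a Requirements expression is a Python function
# of (my, target), mirroring how a classad can refer to its own
# attributes and to those of the candidate match.

RESTRICTED_USERS = {"student1", "student2"}  # hypothetical user group
LOW_QUALITY_KFLOPS = 100_000                 # hypothetical threshold


def job_policy(my, target):
    """Stand-in for NEGOTIATOR_PE_JOB_EXPR: restricted users may only
    use low quality machines."""
    if my["Owner"] in RESTRICTED_USERS:
        return target["KFlops"] <= LOW_QUALITY_KFLOPS
    return True


def resource_policy(my, target):
    """Stand-in for NEGOTIATOR_PE_RESOURCE_EXPR: no extra restriction."""
    return True


def and_with_policy(requirements, policy):
    """Build the effective Requirements: original AND policy."""
    return lambda my, target: requirements(my, target) and policy(my, target)


def matches(job, machine):
    """A match needs both effective Requirements expressions to hold."""
    job_req = and_with_policy(job["Requirements"], job_policy)
    machine_req = and_with_policy(machine["Requirements"], resource_policy)
    return job_req(job, machine) and machine_req(machine, job)


fast = {"Name": "fast", "KFlops": 900_000, "Requirements": lambda my, t: True}
slow = {"Name": "slow", "KFlops": 50_000, "Requirements": lambda my, t: True}
job = {"Owner": "student1", "Requirements": lambda my, t: t["KFlops"] > 0}

print(matches(job, fast))  # False: the policy term forces Requirements to False
print(matches(job, slow))  # True
```

Unlike the three discard expressions, a policy expression used this way must evaluate to true for a match to remain possible, because it is combined with the original Requirements by AND.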
The design alternatives and the exact description of the current design can be found here
Status
This project is complete. A modified version of the Negotiator, based on Negotiator v6.6.0, is available for download. Further development depends on user requests.
Download
Negotiator with policy enforcement (changes to condor_negotiator version 6.6.0, linux-glibc2.3, dynamic; tested on RH8).
To install: gunzip the download, replace your working negotiator daemon with this one, and add the policy expressions to the negotiator's configuration file.
Example of configuration file
Contact
Mark Silberstein: marks-at-tx.technion.ac.il
Students
Genia Bouts, Sergey Guenender
Supervisors
Mark Silberstein
Prof. Assaf Schuster