OBS scheduler internals

From Tizen Wiki
Jump to: navigation, search


Introduction

This document describes algorithm of an OBS-2.7 scheduler. Scheduler determines the order in which packages are to be built.

OBS scheduler uses external package "perl-BSSolv"[3] for core operations like building of package dependencies, expanding dependencies, sorting packages, etc. In turn, "perl-BSSolv" uses "libsolv"[2] which is a package dependency solver library based on satisfiability algorithm.

Sometimes in this document we will refer to "perl-BSSolv" just as BSSolv.

To make future description clearer we need to define two types of dependencies between packages:

  1. Build dependencies. They are expressed by "BuildRequires" tag in spec file or "Required" one in a project configuration. At the point of a build system it is just a set of packages that are required for building a given package
  2. Install dependencies. At the point of RPM this denotes packages or capabilities to be installed before the given package. They could be obtained via rpm -qp -requires pkg_name.
  3. In fact, there are two ways of adding dependency information to a package:

    1. Automatic dependencies. They are obtained by running ldd on every executable in a package's %files list. In current versions of RPM is used rpmdeps: rpmdeps --define="_use_internal_dependency_generator 1" -requires
    2. Manual dependencies. It corresponds to the case of usage "Requires: foo" in a package's spec file.

Scheduler file structure

Below is shown the file structure of OBS scheduler starting at the root of "open-build-service" git repo [1].

  1. src/backend/bs_sched
  2. src/backend/BSSched/
    1. Access.pm
    2. BuildJob.pm
    3. BuildRepo.pm
    4. BuildResult.pm
    5. Checker.pm
    6. DoD.pm
    7. EventHandler.pm
    8. EventQueue.pm
    9. Lookat.pm
    10. ProjPacks.pm
    11. PublishRepo.pm
    12. RPC.pm
    13. Remote.pm
    14. RepoCache.pm
    15. BuildJob/
      1. Aggregate.pm
      2. Channel.pm
      3. DeltaRpm.pm
      4. Import.pm
      5. KiwiImage.pm
      6. KiwiProduct.pm
      7. Package.pm
      8. Patchinfo.pm
      9. PreInstallImage.pm
      10. SimpleImage.pm
      11. Unknown.pm
      12. Upload.pm
      13. EventSource/
        1. Directory.pm
        2. RemoteWatcher.pm
        3. Retry.pm

Table 1: Scheduler file structure.

Being installed on a system scheduler code is located at /usr/lib/obs/server. In the text below we will often refer to a different perl modules. Some modules are not part of "open-build-service" repo and come in separate packages: "perl-BSSolv", "obs-build". To get correspondence between module and its file use the following table:

Module name
File path

Checker

BSSched/Checker.pm

Build

Build.pm in "obs-build" git repository [4].

BSSolv, Pool

BSSolv.xs in "perl-BSSolv" git repository [3].

Package

BSSched/BuildJob/Package.pm

Aggregate

BSSched/BuildJob/Aggregate.pm

Patchinfo

BSSched/BuildJob/Patchinfo.pm

Table 2.

Test project

To get better understanding about different aspects of scheduler algorithm we will use a test project with two modifications: with and without cyclical dependencies (see Fig. 1). There are no hidden interconnections between packages, so their names could be arbitrary. As an example packages in this test project are connected via "BuildRequires" relation:

Image001.png

Figure 1: Example dependency graphs. Dashed line exists only for "cycle" case.

Each package builds the same "Hello, World" application. All sources are placed in one package - "A". Remained packages are just links to the "A":

osc linkpac Tizen:Base:TestBuildOrder A Tizen:Base:TestBuildOrder AA -f
osc linkpac Tizen:Base:TestBuildOrder A Tizen:Base:TestBuildOrder AB -f
...
osc linkpac Tizen:Base:TestBuildOrder A Tizen:Base:TestBuildOrder CC -f

Test project uses "Tizen:Base" one to get needed build environment:

<repository name="ia32">
   <path project="Tizen:Base" repository="ia32"/>
   <arch>i586</arch>
</repository>

Scheduler working scheme

Image002.png

Figure 2: Scheduler working scheme. Green arrows mean routine call; dashed line from one block denotes exclusive alternative exection flow.

Below are described more deeply the steps from the diagram.

The scheduler communicates with other parts of OBS through the set of events. Each event is just an XML file that contains its type, target project, architecture, etc. There is a handler for each type of event. Full list is shown below. For more details see BSSched/EventHandler.pm.

'built'
'uploadbuild'
'import'
'srcevent'
'package'
'project'
'projevent'
'lowprioproject'
'repository
'repoinfo'
'rebuild'
'recheck'
'admincheck'
'unblocked'
'relsync'
'scanrepo'
'scanprjbinaries'
'dumprepo'
'wipenotyet'
'wipe'
'exit'
'exitcomplete'
'restart'
'dumpstate'
'useforbuild'
'configuration'
'suspendproject'

For instance, the following actions trigger events: new package is built, change of package sources/project configuration, wipe binaries, etc.

The diagram shows iteration of algorithm after receiving an event.

0. Set up global context.

Global context (gctx) is a structure that contains all data of the current scheduled project, namely: its configuration, chain of repositories, parsed data for each package, readiness status of repos/packages, so on.

Get from Source server info about each package in a project. It includes: build dependencies obtained from spec file, md5 sum over package sources/meta and so on.

On source server side it corresponds to the following call chain:

getprojpack() //bs_srcserver
   Build::parse_typed(proj_conf, spec_file, buildtype)
      Build::Rpm::parse()

1. Checker::new.

Create new "Checker" object. It will be responsible for checking the status of a project's repository.

At this step "Checker" parses project configurations of a given project and all its ancestors, i.e. projects that are required by the given one.

2. Checker::preparepool.

Create new "Pool" object using "perl-BSSolv" library. Pool is a special data structure to store package repositories that has fast retrieval operations

3. Checker::addrepo.

For a given project and all its ancestors add to the pool their repositories. Architecture of a repository to be added is determined by scheduler architecture. By design, there should be only one scheduler object per architecture registered in OBS.

4. Pool::createwhatprovides.

Create list of what capabilities each package provides. For instance, as a capability can be a package name, shared library's soname and even arbitrary character string.

5, 6. Expander::new; Expander::expand.

Create new "Expander" object and register expand handlers which are in "perl-BSSolv" library. Expander is responsible for finding transitive dependencies for a given package(s).

For the details of expanding algorithm see section [6].

build::expand()
  expander::expand()

7. BSSolv::depsort.

Determine and break cycles in a dependency graph if any. Then make topological sort for project packages. To do this, depsort uses a package's expanded dependency list obtained from the previous step. As an example, below is shown the result of depsort for our test project for cyclical and no cyclical cases. It turns out that exactly for our test project the order is the same in both cases.

Cyclic
No cyclic

A

AA

AB

B

BB

C

CC

A

AA

AB

B

BB

C

CC

Table 3: depsort-ed packages

In the figure below packages are order after depsort. If we omit dash line which makes a cycle we get classical definition of topologically sorted oriented graph.

Image003.png

Figure 3: Packages order after depsort.

Notice that in case of cyclical dependencies packages "AA" and "BB" are not contained in a cycle. It could be reasonable in this situation to block these packages until packages in a cycle become ready and then enable their build. I.e. "AA" and "BB" should be built only one time. But source code and experiments show this is not the case. For instance package "AA" is built twice. Here is an example of build jobs start time sequence for the test project: A, AB, B, C, AA, CC, B, A, BB, AA.

Theoretically, this behavior could significantly slowdown a build process.

8. Checker::checkpkgs.

At this step, we need to decide for each package is it possible start its build job or not.

Source code: checkpkgs() in BSSched/Checker.pm; check() in BuildJob/Package.pm; For details see section [7].

Expanding dependencies

An algorithm of expanding dependencies is "distributed" to some degree, because some parts are contained in the code of "Source server"(bs_srcserver) while others in the code of bs_sched.

Suppose that we need to get expanded dependencies for a package pkg.

Generally, the algorithm has three main steps:

  1. Finding explicit dependencies of pkg, obtained list is deps_list. Getting the list of packages that will be used as a filter on step 3. Obtained list is bdeps_list (see details in 1.3).
  2. Finding transitive dependencies for each package of deps_list. Obtained here list is edeps_list.
  3. Deleting bdeps_list from edeps_list.


Consider each step in details.

  1. Finding explicit dependencies.
    1. Get from pkg's spec file packages denoted by "BuildRequires:" and add to deps_list. This step is performed by "Source server". Notice that "Requires" packages in spec file are not considered because this tag corresponds to the install dependencies. (parse() in /usr/lib/build/Build/Rpm.pm).
    2. Get from project configuration packages denoted by "Required:" and add to deps_list. (get_deps() in /usr/lib/build/Build.pm).
    3. Get from project configuration packages denoted by "Preinstall" and "Support" and add to bdeps_list. Why exactly these two groups? "Preinstall" as well as "Support" packages do not trigger rebuild in case of changing sources. See description of these tags here [8]. (get_deps() in /usr/lib/build/Build.pm)
    4. Delete deps_list from bdeps_list. (get_deps() in /usr/lib/build/Build.pm)
  2. Finding transitive dependencies. Here is used expand function from "perl-BSSolv" package. In turn, "perl-BSSolv" is based on "libsolv" library. Expand accepts deps_list and returns transitive install dependencies. This function as well as whole "perl-BSSolv" package is written in "C". What this function does is just traverses graph of dependencies via some kind of Breadth-first Search algorithm.
  3. Deleting bdeps_list from edeps_list. (get_deps() in /usr/lib/build/Build.pm)

Below is shown the trace of this algorithm on a simple test project. The structure of this project differs from the one described in section [4], though sources are the same. Prebuilt packages and project configuration are taken from Tizen:Base.

Image004.png

Figure 4.

Here are standard lists of packages that we will use below.

1.std_deps =

binutils
gcc
glibc
rpm-build
libtool
gcc-c++

2. std_bdeps = //notice this list doesn't contain glibc

bash
build
build-compare
build-mkbaselibs
bzip2
coreutils
cpio
cpp
diffutils
elfutils
file
filesystem
findutils
gawk
glibc-locale
grep
gzip
hostname
less
libacl
libatomic
libattr
libbz2
libcap
libelf
libfreebl3
libgcc
libgomp
libitm
liblua
liblzma
libmagic
libmagic-data
libncurses
libpopt
libreadline
libsmack
libsoftokn3
libsqlite
libstdc++-devel
libxml2
m4
make
net-tools
nspr
nss
patch
perl
pkg-config
rpm
rpmlint-mini
rpmlint-tizen
sed
setup
smack
tar
tzdata
update-alternatives
util-linux
which
zlib

3.std_edeps =

autoconf
automake
binutils
bzip2
coreutils
db4
filesystem
findutils
gcc
gcc-c++
glibc
glibc-devel
gzip
libacl
libatomic
libattr
libbz2
libcap
libcilkrts
libelf
libfreebl3
libgcc
libgomp
libitm
libltdl
liblua
liblzma
libmagic
libmagic-data
libpopt
libsoftokn3
libsqlite
libstdc++
libtool
libubsan
linux-glibc-devel
m4
make
nspr
nss
nss-certs
patch
perl
rpm
rpm-build
setup
tar
xz
zlib

4. std_final_edeps =

binutils
gcc
glibc
rpm-build
libtool
gcc-c++
libstdc++
libcilkrts
libubsan
glibc-devel
xz
libltdl
automake
linux-glibc-devel
autoconf
nss-certs
db4

pkg: "A"

1.1: deps_list = null;
1.2: deps_list = (std_deps);
1.3: bdeps_list =(glibc, std_bdeps);
1.4: bdeps_list = (std_bdeps);
2: edeps_list = (std_edeps);
3: edeps_list = (std_final_edeps);

pkg = "AA"

1.1: deps_list = (A); //because of BuildRequires: A
1.2: deps_list = (A, std_deps);
1.3: bdeps_list = (glibc, std_bdeps);
1.4: bdeps_list = (std_bdeps);
2: edeps_list = (A, std_edeps);
3: edeps_list = (A, std_final_edeps);

pkg = "AB"

1.1: deps_list = (AA) //because of BuildRequires: AA
1.2: deps_list = (AA, std_deps);
1.3: bdeps_list = (glibc, std_bdeps);
1.4: bdeps_list = (std_bdeps);
2: edeps_list = (A, AA, std_edeps);
3: edeps_list = (A, AA, std_final_edeps);

pkg = "B"

1.1: deps_list = (AB); //because of BuildRequires: AB
1.2: deps_list = (AB, std_deps);
1.3: bdeps_list = (glibc, std_bdeps);
1.4: bdeps_list = (std_bdeps);
2: edeps_list = (A, AA, AB, std_edeps);
3: edeps_list = (A, AA, AB, std_final_edeps);

Below is a graphical representation of expanded dependencies for each package. On the other hand, it gives all build dependencies of a package, i.e. those change in any of them will trigger a build.

Here are not shown "standard" dependencies like gcc, binutils, etc.

Image005.jpg

Figure 5.


The second example is pretty the same, the only difference is in project configuration.

We added there "Support: AA" clause. Let's see what is changed.

pkg = "A"

1.1: deps_list = null;
1.2: deps_list = (std_deps);
1.3: bdeps_list = (AA, glibc, std_bdeps); //AA is added because of "Support: AA"
1.4: bdeps_list =(AA, std_bdeps);
2: edeps_list = (std_edeps);
3: edeps_list = (std_final_edeps);

pkg = "AA"

1.1: deps_list = (A); //because of BuildRequires: A
1.2: deps_list = (A, std_deps);
1.3: bdeps_list = (glibc, AA, std_bdep);
1.4: bdeps_list = (AA, std_bdeps);
2: edeps_list = (A, std_edeps);
3: edeps_list = (A, std_final_edeps);

pkg = "AB"

1.1: deps_list = (AA); //because of BuildRequires: AA
1.2: deps_list = (AA, std_deps);
1.3: bdeps_list = (AA, glibc, std_bdeps);
1.4: bdeps_list = (''std_bdeps''); //glibc and AA have been deleted
2: edeps_list = (A, AA, std_edeps);
3: edeps_list = (A, AA, std_final_edeps);

pkg = "B"

1.1: deps_list = (AB); //because of BuildRequires: AB
1.2: deps_list = (AB, std_deps);
1.3: bdeps_list = (AA, glibc, std_bdeps);
1.4: bdeps_list = (AA, std_bdeps);
2: edeps_list = (A, AA, AB, std_edeps);
3: edeps_list = (A, AB, std_final_edeps); //notice, AA is removed from edeps_list

Below is a graphical representation of expanded dependencies in this case. After adding "Support: AA" to project configuration package "B" doesn`t depend anymore on "AA" one.

Image006.jpg

Figure 6.

Packages building order

At the high level of view, the order of building packages depends on the following things:

  1. Packages sorting order (the result of depsort function).
  2. Graph of expanded dependencies. Presence of dependency cycles.
  3. The results of check function called for each package.
  4. The number of OBS workers and their workload.

Notice that OBS scheduler doesn't build packages itself, it just issues the sequence of build jobs. In turn, OBS dispatcher is responsible for assigning each build job to a free OBS worker. What we are interested in is the order of build jobs in this sequence. Conditions listed above are changed dynamically. Thus, there is no way to predict exact building order in the future.

We can characterize scheduler working algorithm as iterative. When package building is finished the results are transferred to a binary repository. This triggers "built" event for a scheduler. Since the "World state" has been changed, scheduler recalculates what packages are ready to start building and what packages should be blocked.

At the first we consider check function and then describe an algorithm of build jobs ordering.

As an example we consider check function for ordinary package (with spec file). Its source code is in BuildJob/Package.pm.

For the future explanation we need to define two objects:

  1. "lastcheck" data of a package. It's just a string with the following structure: srcmd5.metamd5.hdrmetamd5.statdata (32+32+32+x bytes), where:
    • srcmd5: md5 hash over package sources
    • metamd5: md5 hash over all meta files of expanded dependencies
    • hdrmetamd5: md5 hash over the package header and metamd5of its expanded dependencies
    • statdata: system status of the meta file
  2. Package's "meta" (pkgname.meta). Here is an example for package "B":
    'cf880ef4e7606b6fd662266b2ac174b1  B',
    '14239bea3be4d25e1f6063c7f7a3722f  autoconf',
    '10274416185093ebc95a387395d675d0  automake',
    '176a1f85823404f916bd6757a472b1e8  binutils',
    '929c5b5f00a6906cfec8988541855d1c  db4',
    'e41cd9e9ce0b7c6d7b51003a8c29781b  gcc',
    '635f6acd1a621293927aa5e87b154eea  gcc-c++',
    'bb2582224e661ac4f8cf33fd02d58800  glibc',
    '72fabfd30630826a281e24896f0e2339  glibc-devel',
    '665ddec68b0f99b17e000eeb4b118282  libcilkrts',
    '0abc8373d3d2e432c1709415ed89c6e1  libltdl',
    '3accf2eafcd220bf9afcce054cc1857a  libstdc++',
    '4257a16a060cd87fd19d41549cbdcc93  libtool',
    'ac9a32e3fa1496fea6def186b7863bab  libubsan',
    '403435c4b84de4339c1c9e9b2584dd73  linux-glibc-devel',
    'cf2a7965637e791bdbb9ce8faee693e8  nss-certs',
    'd823bd861c0d0484d581f1be59870977  rpm-build',
    'f1d43ffd21abb0320685b139111a0044  xz',
    'a4b629063437fb4927f575ef7f1429ed  B/AB'
    

    For each package in B's edeps list there is an appropriate line that contains md5 hash over a package header and the payload (gzip-ed cpio archive).


These two objects play core role in the steps below.


Each package in a project goes through the following check steps:

  1. If the "meta" of the current package doesn't exist return "scheduled". Retrieve the former lastcheck object. If it doesn't exist or its statdata field doesn't match status of "meta" file, then create new lastcheck object base on "meta" file. Compare srcmd5 of a package with srcmd5 field of a lastcheck object. If they differ return "scheduled". If there is no changes go to 2.
  2. Calculate "check" object:
    my $check = substr($mylastcheck, 32, 32);
    my $pool = $ctx->{'pool'};
    my $dep2pkg = $ctx->{'dep2pkg'};
    $check .= $rebuildmethod;
    $check .= $pool->pkg2pkgid($dep2pkg->{$_}) for sort @$edeps;
    $check = Digest::MD5::md5_hex($check);
    

    The most important here is that pkg2pkgid function returns md5 hash over the package header and the payload. Hence, this object signal about changes in expanded dependencies if any.

    Compare "check" object with hdrmetamd5 of the lastcheck object. If they differ go to 3. If not, return "done".

  3. For the given package create new "meta" with the BSSolv::gen_meta() function and calculate its md5 hash and then compare it with metamd5 part of the lastcheck object. If they differ return "scheduled". If not, return "done".


Let's consider an algorithm of build jobs ordering.

Used variables:

  1. my @cpacks; Contains sorted (by depsort) packages of a project. For example: ('A', 'AA', 'AB', 'B', 'BB', 'C', 'CC')
  2. my $packid; Current package name.
  3. my %packstatus; Hash that contains packages' current status. Status can be one of: "scheduled", "locked", "blocked", "broken", "disabled" and "done".
  4. my %cychash; Hash that for a given package contains all packages of its cycle. It's used only for cycle packages. For example:
    %cychash = {
       'AB' => [
                   'A',
                   'AB',
                   'B'
                  ],
        'A' => [
                   'A',
                   'AB',
                   'B'
                  ],
        'B =>  [
                   'A',
                   'AB',
                   'B'
                  ],
    
    };
    

  5. my %cycpass; Hash that contains package's pass (could be one of 1, 2 or 3). It's used only for cycle packages. For example:
    %cycpass = {
       'AB' => 1,
       'B' => 1,
       'A' => 1
    };
    

  6. my %building; Hash that contains package's build job id. Example:
    %building = {
       'AB' => 'Tizen:Base:TestBuildOrder::ia32::AB-b50bb0a7c466b6da8fcbcbcd7c17e237',
       'A' => 'Tizen:Base:TestBuildOrder::ia32::A-e1746808d6a563a207df075891ad4556'
    };
    

  7. my %notready; Hash that contains "scheduled" or "blocked" packages. Example:
    %notready = {
       'A' => 1,
       'AB' => 1,
       'AA' => 1
    };
    

  8. my @blocked; An array that for a given package contains intersection of its expanded dependencies and <code">%notready</code>.
  9. </ol>


    Below is simplified version of an algorithm that creates build jobs in proper order. As an input it takes depsort-ed packages of a project.

    Clear %packstatus, %cycpass, %building

    Make %cychash;

    while(@cpacks))
    {
       my $packid = shift @cpacks; //get first package
       my $incycle = 0;
    
       if ($cychash{$packid}) {
          next if $packstatus{$packid} && $packstatus{$packid} ne 'done';
    
    

    Cycle packages require additional processing. We look at a cycle two times:

    1. Just trigger package builds caused by source changes.
    2. Normal package build triggering caused by dependencies.

    Thus each cycle package goes through the two phases.

    Calculate phase 1 packages. Get packages that belong to the cycle of $packid and was not already considered.

          my @cnext = grep {!$cycpass{$_}} @{$cychash{$packid}};
          if (@cnext) {
    

    Still phase 1 packages left, do them first. Prepend $packid to the @cpacks so that we can process all packages that belong to its cycle.

             unshift @cpacks, $packid;
             $packid = shift @cnext;
             $cycpass{$packid} = 1; # now doing phase 1.
             $incycle = 1;
          } elsif (($cycpass{$packid} || 0) < 2) {
             # enter phase 2
             $cycpass{$packid} = 2;
             my $pass = 2;
    

    We are building packages because of source changes. Set %cycpass to 3 so that we don't start other builds.

             $pass = 3 if grep {$building{$_}} @{$cychash{$packid}};
             $cycpass{$_} = $pass for @{$cychash{$packid}};
          }
       }
    
       if (!$incycle) {
          my $job = BSSched::BuildJob::jobname($prp, $packid)."-$pdata->{'srcmd5'}";
          my $myjobsdir = $gctx->{'myjobsdir'};
    

    Check that build job is already exists.

       if (-s "$myjobsdir/$job") {
          $building{$packid} = $job;
          $notready->{$pname} = 1 if $useforbuildenabled;
          $packstatus{$packid} = 'scheduled';
          next;
       }
    

    Get expanded dependencies of a package.

      my $edeps = $ctx->{'edeps'}->{$packid} || [];
    

    Calculate if package is blocked by its expanded dependencies. Here we filter $edeps list through the %notready one.

       my @blocked = grep {$notready{$_}} @$edeps;
    
       if (%cychash{$packid}) {
          my $cycpass = %cycpass{$packid} || 0;
          if (@blocked && $cycpass == 3) {
    

    $cycpass == 3 means that packages of this cycle are building because of source changes. Block $packid<code> until packages from <code>@blocked become ready.

             $packstatus{$packid} = ‘blocked’;
             $notready->{$pname} = 1 if $useforbuildenabled;
             next;
          }
          my %cycs = map {$_ => 1} @{$ctx->{'cychash'}->{$packid}};
    

    Prune building cycle packages from blocked.

         @blocked = grep {!%cycs{$_} || !$building{$_}} @blocked;
    

    In other words, here we ignore those packages in @blocked that belongs to a cycle and still are building.

    
          if (@blocked) {
             $packstatus{$packid} = ‘blocked’;
             $notready->{$pname} = 1 if $useforbuildenabled;
             next;
          }
       }
     
       if(check_1($packid) eq  'scheduled') {
          goto BUILD;
       } else {
          $packstatus{$packid} = ‘done’;
          next;
       }
    
       if ($incycle != 0) {
          $packstatus{$packid} = ‘done’;
          next;
       }
    
       if(check_b($packid) eq  'scheduled') {
          goto BUILD;
       } else {
          $packstatus{$packid} = ‘done’;
          next;
       }
    
       if( check_c($packid) eq  'scheduled') {
          goto BUILD;
       } else {
          $packstatus{$packid} = ‘done’;
          next;
       }
    
    BUILD:
    
       if (packstatus{$packid} eq 'scheduled') {
    

    Create a new build job. Check if this job is already building, if yes, do nothing. Otherwise calculate and expand build dependencies, kill all other jobs of the same prp/package, write status and job info.

          my $job = BuildJob::create($ctx, $packid, $pdata, $info, $subpacks, $edeps, $reason, $needed);
          $building{$packid} = $job;
          $packstatus{$packid} = 'scheduled';
          $notready->{$pname} = 1 if $useforbuildenabled;
       }
    }
    

    Putting it all together

    In this section we will try to put together that said above. Using test project from the section [4] we will see which steps OBS Scheduler goes through in order to build entire project.

    At first, let's describe scheduler algorithm from the high level.

    At each time project is determined by its state: disjoint sets of the following groups of packages:

    1. Packages that are blocked by another packages.
    2. Packages that are scheduled, i.e. ones that are waiting to be dispatched on a free workers.
    3. Building packages.
    4. Ready packages.

    An algorithm is essentially iterative. Subsequent project state is determined by its previous state and new build results. Transition from current state to the next one is triggered by an event, see [5].

    Below is described what scheduler does on each iteration.

    1. Sort project packages in topological order.
    2. Determine build dependencies for each package.
    3. Check each package is it ready to get built or not. Cycle packages are processed slightly different way from non-cycled ones. Cycle package is checked two times (that corresponds to two phases). First time: package build is triggered by source changes. Second one: normal package build (triggered by change of a "meta"). For each package is calculated @blocked list, i.e. packages that block building of a given one. @blocked list is pruned in case on cycle packages: are removed packages that belong to the same cycle and still are building. For both types of packages are compared last build results with former ones to figure out are they differ or not.


    Trace of a test project.


    Below are initial values of some variables:

    %cychash = {
       'AB' => [
                    'A',
                    'AB',
                    'B'
                  ],
       'A' =>  [
                    'A',
                    'AB',
                    'B'
                  ],
       'B' =>  [
                    'A',
                    'AB',
                    'B'
                  ]
    };
    

    %edeps = {
       AA => [ A ],
       B =>  [ AB],
       CC => [ C ],
       A =>  [ B ],
       C =>  [ ],
       AB => [ A ],
       BB => [ B ]
    };
    

    In the description below each event has: a table with context parameters that describe package state; graphical representation of a project after recalculating new state.

    Context parameters table.

    • Each row corresponds to a call of check function. For a cycle packages this function may be called twice. Columns correspond to parameters from the section [7].
    • "Result" column contains result of check call. "@blocked" column correspond to parameters that block building of a given package.
    • Cells marked in pink means that line containing this cell represents cycle packages.
    • Special syntax: "(-pkg1), (-pkg2),..., (-pkgN)" means that from "@blocked" additionally were removed packages pkg1, pkg2,..., pkgN.
    • If row is gray this means that check is actually not called for a specified package. This may happen only with a cycle packages:

    while(@cpacks)) {
    ...
    
       if ($cychash{$packid}) {
          next if $packstatus{$packid} && $packstatus{$packid} ne 'done';
    ...
    
    check($packid);
    

    It just means that check was already called for this package and currently it's still not ready. So, when reading these lines may be skipped.


    Graphical representation of a package state.


    Image007.png

    Notice that this test was carried out on OBS with a single worker.


    Event 1: source change (for instance package "A").


    pkg/
    (phase)
    $cycpass
    $incycle
    @blocked
    @building
    @notready
    $packstatus
    Result
    A/1
    1
    1
    scheduled
    src change, start build
    A/1
    scheduled
    AA
    0
    A
    A
    blocked
    blocked by A
    AB/1
    1
    1
    (-A)
    A
    A, AA
    scheduled
    src change, start build
    AB/1
    scheduled
    B/1
    1
    1
    (-AB)
    A, AB
    A, AA, AB
    scheduled
    src change, start build
    B/1
    scheduled
    BB
    0
    B
    A, AA, AB, B
    blocked
    blocked by B
    C
    0
    A, AA, AB, B, BB
    scheduled
    src change, start build
    CC
    0
    C
    A, AA, AB, B, BB, C
    blocked
    blocked by C

    Scheduler has created build jobs for packages "A", "AB", "B" and "C". They are allowed to be built in parallel.

    Test OBS configuration includes only one worker, thus Dispatcher can assign any of them to a worker. It selected "B".

    Notice that cycle packages on phase "1" can be built in parallel.

    Image008.png


    Event 2: B.rpm is ready.


    Pkg/
    (phase)
    $cycpass
    $incycle
    @blocked
    @building
    @notready
    $packstatus
    Result
    A/1
    1
    1
    scheduled
    src change, start build
    A/1
    scheduled
    AA
    A
    A
    blocked
    blocked by A
    AB/1
    1
    1
    (-A)
    A
    A, AA
    scheduled
    src change, start build
    AB/1
    scheduled
    B/1
    1
    1
    (-AB)
    A, AB
    A, AA, AB
    done
    in cycle, no source change
    B/2
    3
    0
    AB
    A, AB
    blocked
    blocked by cycle builds (AB...)
    BB
    0
    B
    A, AA, AB, B
    blocked
    blocked by B
    C
    scheduled
    already scheduled
    CC
    0
    C
    A, AA, AB, B, BB, C
    blocked
    blocked by C

    Image009.png


    Event 3: A.rpm is ready.


    Pkg/
    (phase)
    $cycpass
    $incycle
    @blocked
    @building
    @notready
    $packstatus
    Result
    A/1
    1
    1
    done
    in cycle, no source change
    A/1
    done
    AB/1
    1
    1
    scheduled
    src change, start build
    A/1
    done
    B/1
    1
    1
    (-AB)
    AB
    AB
    done
    in cycle, no source change
    A/2
    3
    0
    AB
    AB
    done
    nothing changed (looked harder)
    AA
    0
    AB
    scheduled
    src change, start build
    AB/1
    scheduled
    B/2
    3
    0
    AB
    AA,AB
    blocked
    blocked by cycle builds (AB...)
    BB
    0
    B
    AA, AB, B
    blocked
    blocked by B
    C
    0
    scheduled
    already scheduled
    CC
    0
    C
    AA, AB, BB, B, C
    blocked
    blocked by C

    Image010.png


    Event 4: AB.rpm is ready.


    Pkg/
    (phase)
    $cycpass
    $incycle
    @blocked
    @building
    @notready
    $packstatus
    Result
    A/1
    1
    1
    done
    in cycle, no source change
    A/1
    Done
    AB/1
    1
    1
    Done
    in cycle, no source change
    A/1
    done
    B/1
    1
    1
    done
    in cycle, no source change
    A/2
    2
    0
    done
    nothing changed
    AA
    0
    scheduled
    already scheduled
    AB/2
    2
    0
    AA
    AA
    done
    nothing changed (looked harder)
    B/2
    2
    0
    AA
    AA
    scheduled
    !AB
    !AB/A
    meta change, start build
    BB
    B
    B, AA
    blocked
    Blocked by B
    C
    0
    scheduled
    already scheduled
    CC
    C
    AA, B, C, BB
    blocked
    Blocked by C

    Image011.png


    Event 5: C.rpm is ready.


    Pkg/
    (phase)
    $cycpass
    $incycle
    @blocked
    @building
    @notready
    $packstatus
    Result
    A/1
    1
    1
    done
    in cycle, no source change
    A/1
    done
    AB/1
    1
    1
    done
    in cycle, no source change
    A/1
    done
    B/1
    1
    1
    done
    in cycle, no source change
    A/2
    2
    0
    done
    nothing changed
    AA
    scheduled
    already scheduled
    AB/2
    2
    0
    AA
    AA
    done
    nothing changed
    B/2
    2
    0
    scheduled
    already scheduled
    BB
    0
    B
    A, BB
    blocked
    blocked by B
    C
    0
    B, AA, BB
    done
    nothing changed (looked harder)
    CC
    0
    B, AA, BB
    scheduled
    src change, start build

    Image012.png


    Event 6: AA.rpm is ready.


    Pkg/
    (phase)
    $cycpass
    $incycle
    @blocked
    @building
    @notready
    $packstatus
    Result
    A/1
    1
    1
    done
    in cycle, no source change
    A/1
    done
    AB/1
    1
    1
    done
    in cycle, no source change
    A/1
    done
    B/1
    1
    1
    done
    in cycle, no source change
    A/2
    2
    0
    done
    nothing changed
    AA
    0
    done
    nothing changed (looked harder)
    AB/2
    2
    0
    done
    nothing changed
    B/2
    2
    0
    scheduled
    already scheduled
    BB
    0
    B
    B
    blocked
    Blocked by B
    C
    0
    B, BB
    done
    nothing changed
    CC
    0
    scheduled
    already scheduled

    Image013.png


    Event 7: CC.rpm is ready.


    Pkg/
    (phase)
    $cycpass
    $incycle
    @blocked
    @building
    @notready
    $packstatus
    Result
    A/1
    1
    1
    done
    in cycle, no source change
    A/1
    done
    AB/1
    1
    1
    done
    in cycle, no source change.
    A/1
    done
    B/1
    1
    1
    done
    in cycle, no source change
    A/2
    2
    0
    done
    nothing changed
    AA
    0
    done
    nothing changed
    AB/2
    2
    0
    done
    nothing changed
    B/2
    2
    0
    scheduled
    already scheduled
    BB
    0
    B
    B
    blocked
    Blocked by B
    C
    0
    B, BB
    done
    nothing changed
    CC
    0
    B, BB
    done
    nothing changed (looked harder)

    Image014.png


    Event 8: B.rpm is ready.


    Pkg/
    (phase)
    $cycpass
    $incycle
    @blocked
    @building
    @notready
    $packstatus
    Result
    A/1
    1
    1
    done
    in cycle, no source change
    A/1
    done
    AB/1
    1
    1
    done
    in cycle, no source change
    A/1
    done
    B/1
    1
    1
    done
    in cycle, no source change
    A/2
    2
    0
    scheduled
    !B/AB
    meta change, start build
    AA
    0
    A
    A
    blocked
    Blocked by A
    AB/2
    2
    0
    (-A)
    A
    A, AA
    done
    nothing changed
    B/2
    2
    0
    A
    A, AA
    done
    nothing changed (looked harder)
    BB
    0
    A, AA
    scheduled
    src change, start build
    C
    0
    A, AA, BB
    done
    nothing changed
    CC
    0
    A, AA, BB
    done
    nothing changed

    Image015.png


    Event 9: BB.rpm is ready.


    Pkg/
    (phase)
    $cycpass
    $incycle
    @blocked
    @building
    @notready
    $packstatus
    Result
    A/1
    1
    1
    done
    in cycle, no source change
    A/1
    done
    AB/1
    1
    1
    done
    in cycle, no source change
    A/1
    done
    B/1
    1
    1
    done
    in cycle, no source change
    A/2
    2
    0
    scheduled
    already scheduled
    AA
    0
    A
    A
    blocked
    Blocked by A
    AB/2
    2
    0
    (-A)
    A
    A, AA
    done
    nothing changed
    B/2
    2
    0
    A
    A, AA
    done
    nothing changed
    BB
    0
    A, AA
    done
    nothing changed (looked harder)
    C
    0
    A, AA
    done
    nothing changed
    CC
    0
    A, AA
    done
    nothing changed

    Image016.png


    Event 10: A.rpm is ready.


    Pkg/
    (phase)
    $cycpass
    $incycle
    @blocked
    @building
    @notready
    $packstatus
    Result
    A/1
    1
    1
    done
    in cycle, no source change
    A/1
    done
    AB/1
    1
    1
    done
    in cycle, no source change
    A/1
    done
    B/1
    1
    1
    done
    in cycle, no source change
    A/2
    2
    0
    done
    nothing changed (looked harder)
    AA
    0
    scheduled
    !A/B/AB
    meta change, start build
    AB/2
    2
    0
    AA
    AA
    done
    nothing changed (looked harder)
    B/2
    2
    0
    AA
    done
    nothing changed
    BB
    0
    AA
    done
    nothing changed
    C
    0
    AA
    done
    nothing changed
    CC
    0
    AA
    done
    nothing changed

    Image017.png


    Event 11: AA.rpm is ready.


    Pkg/
    (phase)
    $cycpass
    $incycle
    @blocked
    @building
    @notready
    $packstatus
    Result
    A/1
    1
    1
    done
    in cycle, no source change
    A/1
    done
    AB/1
    1
    1
    done
    in cycle, no source change
    A/1
    done
    B/1
    1
    1
    done
    in cycle, no source change
    A/2
    2
    0
    done
    nothing changed
    AA
    0
    done
    nothing changed (looked harder)
    AB/2
    2
    0
    AA
    AA
    done
    nothing changed
    B/2
    2
    0
    AA
    done
    nothing changed
    BB
    0
    AA
    done
    nothing changed
    C
    0
    AA
    done
    nothing changed
    CC
    0
    AA
    done
    nothing changed

    Image018.png

    References

    1. https://github.com/openSUSE/open-build-service
    2. https://github.com/openSUSE/libsolv
    3. https://github.com/openSUSE/perl-BSSolv
    4. https://github.com/openSUSE/obs-build
    5. https://github.com/openSUSE/build-compare
    6. http://openbuildservice.org/help/manuals/obs-reference-guide/
    7. https://en.opensuse.org/openSUSE:Specfile_guidelines
    8. https://en.opensuse.org/openSUSE:Build_Service_prjconf