The Internal Sweep Random Number Smart Connector

In this example, we create a smart connector with an internal parameter sweep. When this smart connector is executed, it spawns two tasks, and each task generates a pair of random numbers. We assume that the computation platform for this connector is a cloud-based infrastructure. If two virtual machines (VMs) are available, each task runs on its own VM. When both tasks complete, their output is transferred to a user-designated location. We call this smart connector the internal sweep random number smart connector.

  • The purpose of this example is to create a smart connector with an internal parameter sweep.
  • The source code for this example is available at chiminey/examples/randnuminternalsweep.

Requirements

  1. Installation and configuration of the Chiminey server on a virtual machine, according to the Installation Guide.
  2. Registration of a cloud computation platform within the Chiminey UI; this is where the core functionality of the smart connector is executed (see registering Cloud Computation Platform).
  3. Registration of a storage platform within the Chiminey UI; this is the destination of the smart connector output. The storage platform can be any Unix server, including the Chiminey server itself (see registering Unix Storage Platform).

Creating the Internal Sweep Random Number Smart Connector

Here, we create the internal sweep random number smart connector. For this, we need to carry out the following steps, in order:

  1. customise the parent stage to update the sweep map,
  2. prepare a payload,
  3. define the smart connector with the new parent stage and the pre-defined core stages, and
  4. register the smart connector within Chiminey so it can be executed.

Customising the Parent Stage

The customised parent stage, i.e., RandParent, is available at chiminey/examples/randnuminternalsweep/randparent.py.

  1. RandParent subclasses the core parent stage Parent, which is located at chiminey/corestages/parent.py. RandParent overrides get_internal_sweep_map(self, ....) to include a new sweep map; the cross-product of the parameter values in the new sweep map contains two combinations.
  2. Here is the new sweep map that enables the execution of two tasks within a single job submission, {'var': [1, 2]}. var is an unknown parameter.

Below is the content of the RandParent class:

from chiminey.corestages import Parent

class RandParent(Parent):
    def get_internal_sweep_map(self, settings, **kwargs):
        rand_index = 42
        # One task is spawned per value of 'var', so this map yields two tasks.
        sweep_map = {'var': [1, 2]}
        return sweep_map, rand_index
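
To see why this sweep map yields two tasks, the following is a minimal sketch of expanding a sweep map into its cross-product. The function expand_sweep_map is hypothetical and is not part of Chiminey; it only illustrates the counting argument, and Chiminey's own expansion logic may differ.

```python
import itertools

def expand_sweep_map(sweep_map):
    """Return one dict per combination of parameter values (hypothetical helper)."""
    names = sorted(sweep_map)
    value_lists = [sweep_map[name] for name in names]
    # itertools.product enumerates the cross-product of all value lists.
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]

tasks = expand_sweep_map({'var': [1, 2]})
print(len(tasks))  # 2
print(tasks)       # [{'var': 1}, {'var': 2}]
```

By the same reasoning, a map such as {'a': [1, 2], 'b': [3, 4, 5]} would expand to six combinations.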

Preparing a Payload

We now discuss how to prepare a payload for the internal sweep random number smart connector. This step is required because the computation platform of this smart connector is a cloud infrastructure and all cloud-based smart connectors must include their domain-specific executables in a payload.

NB: The payload for the internal sweep random number smart connector is available at chiminey/examples/randnuminternalsweep/payload_randnum.

  1. The Chiminey server expects payloads to be under LOCAL_FILESYS_ROOT_PATH, which is /var/chiminey/remotesys by default. A subdirectory can be created under LOCAL_FILESYS_ROOT_PATH to better organise payloads. On such occasions, the Chiminey server must be configured to point to the subdirectory. Let’s now create a subdirectory my_payloads, and then put payload_randnum under it.

    mkdir -p /var/chiminey/remotesys/my_payloads
    cp -r /opt/chiminey/current/chiminey/examples/randnuminternalsweep/payload_randnum /var/chiminey/remotesys/my_payloads/
    
  2. As recommended in payload, payload_template is used as the starting point to prepare payload_randnum. In order to satisfy the requirements of this smart connector, start_running_process.sh will be changed.

    1. start_running_process.sh includes the logic for generating the random numbers. As expected by the Chiminey server, the output of the program is redirected to a file in the chiminey directory (here, chiminey/rand). Since this random number generator is synchronous, the process ID is not saved. Here is the content of start_running_process.sh:

      #!/bin/sh
      python -c 'import random; print(random.random()); print(random.random())' > chiminey/rand 2>&1
      
    2. process_running_done.sh remains the same because the random number generating program is synchronous.

    3. start_bootstrap.sh and bootstrap_done.sh remain the same. This is because the random number generation depends only on Python, and the Python interpreter included in Linux-based operating systems fulfils this requirement.

    4. start_process_schedule.sh remains the same because there is no process-level configuration requirement.
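
For readers who prefer to see the task logic as a script, here is a rough Python equivalent of the one-liner in start_running_process.sh. It is a sketch only: the function write_random_pair is hypothetical, a temporary directory stands in for the task's working directory, and the payload itself uses the shell script shown above.

```python
import os
import random
import tempfile

def write_random_pair(output_dir):
    """Write two random numbers, one per line, to <output_dir>/rand (hypothetical helper)."""
    if not os.path.isdir(output_dir):
        os.makedirs(output_dir)
    path = os.path.join(output_dir, 'rand')
    with open(path, 'w') as handle:
        for _ in range(2):
            handle.write('%r\n' % random.random())
    return path

# Use a temporary working directory so the sketch does not touch real payload files.
workdir = tempfile.mkdtemp()
rand_path = write_random_pair(os.path.join(workdir, 'chiminey'))

# Read the file back to confirm it holds a pair of numbers.
with open(rand_path) as handle:
    values = [float(line) for line in handle]
print(len(values))  # 2
```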

Defining the Internal Sweep Random Number Smart Connector

The definition of this smart connector, i.e., RandNumInternaSweepInitial, is available at chiminey/examples/randnuminternalsweep/initialise.py.

  1. RandNumInternaSweepInitial subclasses CoreInitial, which is located at chiminey/initialisation/coreinitial.py. RandNumInternaSweepInitial overrides get_updated_parent_params(self), get_updated_bootstrap_params(self) and get_ui_schema_namespace(self).
  2. The parent stage was customised in an earlier step. Therefore, get_updated_parent_params(self) updates the package path to point to the customised parent stage class, which is chiminey.examples.randnuminternalsweep.randparent.RandParent.
  3. get_updated_bootstrap_params(self) updates settings to point the Chiminey server to the location of the new payload. The location of any payload is given relative to LOCAL_FILESYS_ROOT_PATH. Since we previously copied payload_randnum to LOCAL_FILESYS_ROOT_PATH/my_payloads/payload_randnum, the location of the payload is my_payloads/payload_randnum.
  4. The new get_ui_schema_namespace(self) contains three schema namespaces that represent three types of input fields for specifying the name of a cloud-based computation platform, the maximum and minimum number of VMs needed for the job, and an output location (see The Job Submission UI).

Below is the content of RandNumInternaSweepInitial.

from chiminey.initialisation import CoreInitial

class RandNumInternaSweepInitial(CoreInitial):
    def get_updated_parent_params(self):
        return {'package': "chiminey.examples.randnuminternalsweep.randparent.RandParent"}

    def get_updated_bootstrap_params(self):
        settings = {
                u'http://rmit.edu.au/schemas/stages/setup':
                    {
                        u'payload_source': 'my_payloads/payload_randnum',
                    },
            }
        return {'settings': settings}

    def get_ui_schema_namespace(self):
        RMIT_SCHEMA = "http://rmit.edu.au/schemas"
        schemas = [
                RMIT_SCHEMA + "/input/system/compplatform/cloud",
                RMIT_SCHEMA + "/input/system/cloud",
                RMIT_SCHEMA + "/input/location/output",
                ]
        return schemas
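
The relationship between payload_source and LOCAL_FILESYS_ROOT_PATH can be sketched as follows. This assumes the default root /var/chiminey/remotesys mentioned earlier; the helper payload_source_for is hypothetical and only demonstrates the relative-path computation, not how Chiminey resolves the setting internally.

```python
import posixpath

LOCAL_FILESYS_ROOT_PATH = '/var/chiminey/remotesys'  # default value, per the text above

def payload_source_for(absolute_payload_dir, root=LOCAL_FILESYS_ROOT_PATH):
    """Return the payload location relative to the root (hypothetical helper)."""
    relative = posixpath.relpath(absolute_payload_dir, root)
    # Reject directories that are not located under the root.
    if relative == '..' or relative.startswith('../'):
        raise ValueError('payload must live under %s' % root)
    return relative

print(payload_source_for('/var/chiminey/remotesys/my_payloads/payload_randnum'))
# my_payloads/payload_randnum
```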

Registering the Internal Sweep Random Number Smart Connector within Chiminey

A smart connector can be registered within the Chiminey server in various ways. Here, a Django management command is used. chiminey/smartconnectorscheduler/management/commands/randnuminternalsweep.py contains the Django management command for registering the internal sweep random number smart connector. Below is the full content.

from django.core.management.base import BaseCommand
from chiminey.examples.randnuminternalsweep.initialise import RandNumInternaSweepInitial

MESSAGE = "This will add a new directive to the catalogue of" \
          " available connectors.  Are you sure [Yes/No]?"

class Command(BaseCommand):
    """
    Load up the initial state of the database (replaces use of
    fixtures).  Assumes specific structure.
    """

    args = ''
    help = 'Setup an initial task structure.'

    def setup(self):
        confirm = raw_input(MESSAGE)
        if confirm != "Yes":
            print "action aborted by user"
            return

        directive = RandNumInternaSweepInitial()
        directive.define_directive(
            'randnum_internal_sweep', description='RandNum Internal Sweep')
        print "done"


    def handle(self, *args, **options):
        self.setup()
        print "done"

  1. When registering a smart connector, a unique name must be provided, in this case randnum_internal_sweep. If a smart connector with the same name already exists, the command is ignored.

  2. A short description is also needed, in this case RandNum Internal Sweep. Both the unique name and the description are displayed in the Chiminey UI.

  3. Execute the following commands on the Chiminey server terminal. The final line, Yes, is the answer to the confirmation prompt printed by the command.

    sudo su bdphpc
    cd /opt/chiminey/current
    bin/django randnuminternalsweep
    Yes
    
  4. Visit your Chiminey web page and click Create Job. You should see RandNum Internal Sweep under the Smart Connectors menu.

    Figure. The Internal Sweep Random Number Smart Connector

Testing the Internal Sweep Random Number Smart Connector

Now, test that the internal sweep random number smart connector is correctly defined and registered. For this, you will submit an internal sweep random number smart connector job, monitor the job, and view the output of the job.

Submit an internal sweep random number smart connector job

See Job Submission for details.

Figure. An internal sweep random number smart connector job is submitted

Monitor the progress of the job

See Job Monitoring for details.

NB: Since the two tasks are internal to the job, they are not shown on the monitoring page.

Figure. An internal sweep random number smart connector job is completed

View job output

Since this smart connector has two internal tasks, there will be two sets of outputs when the job is completed.

  1. Log in to your storage platform.
  2. Change directory to the root path of your storage platform.
  3. The output is located under smart_connector_uniquenameJOBID, e.g. randnum_internal_sweep226.
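
As a small illustration of the naming scheme in step 3, the sketch below builds the expected output directory name from the connector's unique name and a job ID. The function output_dir_name is hypothetical, and the job ID 226 is simply the example value from the text.

```python
def output_dir_name(connector_name, job_id):
    """Concatenate the unique connector name and the job ID (hypothetical helper)."""
    return '%s%d' % (connector_name, job_id)

print(output_dir_name('randnum_internal_sweep', 226))  # randnum_internal_sweep226
```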