A high-performance computer (HPC) is built from many ordinary computers connected by a network and usually controlled by special software in charge of assigning resources to users. This is a special case of a distributed-memory computing environment. The Cloud Haskell examples available in both references use a controlled local cluster of computers and cloud platforms (of the Infrastructure-as-a-Service kind) to execute the programs.
In this post I show how to execute Cloud Haskell applications on an HPC supercomputer that uses a job scheduler to manage its resources (compute nodes).
How I do it
I work as a supercomputer system administrator of a node called Altamira at the Instituto de Física de Cantabria (IFCA). This supercomputer is composed of almost 250 servers, each with 16 physical cores, 64 GB of RAM and an InfiniBand interconnect. The whole system is managed with the Slurm resource manager, tuned with a few scripts for the RES (Red Española de Supercomputación).
I don't want to replicate all the tutorials again; I only want to run one of them, to test whether I can execute it on a supercomputer like Altamira. So I decided to take just the WorkStealing example and implement it again in a local project.
The Slurm system requires that every program executed on the cluster describe the resources it wants. With this description (usually in a file, called the job description file), the manager queues the job until it finds free space to execute it. This file is also a script that will be interpreted by a shell (Bash in my case), with a series of directives informing the batch system about the characteristics of the job. These directives appear as comments in the job script. For example, a simple job that requires 32 CPUs would contain this:
```bash
#!/bin/bash
#@ job_name = cloud_haskell
#@ initialdir = .
#@ output = cloud_haskell_%j.out
#@ error = cloud_haskell_%j.err
#@ total_tasks = 32
#@ wall_clock_limit = 1:00:00

echo "start running"
```
Once we get the reserved CPUs, we should split them: one for the master and (n-1) for the slaves. We can do this easily using the variables that Slurm defines for our job. Then we use the Slurm program srun to spawn the (n-1) slaves in the background on the right servers, and finally we spawn the master program on the last one. We don't know the names of the assigned nodes, nor how many programs should be spawned per node; srun knows that when spawning the slaves:
```bash
# Spawn the (n-1) slaves in the background, wait a little, then run the master.
srun --exclusive -c1 -n$((SLURM_NTASKS-1)) run-client.sh &
sleep 30
srun --exclusive -c1 -n1 run-server.sh
```
The srun command calls two scripts that start the Cloud Haskell program instances. In these scripts, using the Slurm variables that srun sets, we can pass the right parameters to the application. The client/slave script is (file run-client.sh):
```bash
#!/bin/bash
HOST=`hostname`
~/.cabal/bin/demo-worksteal slave $HOST $((SLURM_LOCALID + 7766))
```
And for the master (file run-server.sh):
```bash
#!/bin/bash
HOST=`hostname`
~/.cabal/bin/demo-worksteal master $HOST $((SLURM_LOCALID + 7866)) 1000
```
For some reason unknown to me, the application starts with awful block buffering. I changed it to line buffering in order to monitor the output during the execution.
```haskell
main = do
  hSetBuffering stdout LineBuffering
  args <- getArgs
```

With a job system, I found a problem when several jobs run simultaneously: one master could steal the slaves of other jobs. To solve this, I pass the list of assigned nodes to the master task. In the job script, we only need to add:
```bash
srun --exclusive -c1 -n1 run-server.sh $SLURM_NODELIST
```

And in the Haskell program, we filter the discovered slaves using the list of nodes assigned to that job:
```haskell
["master", host, port, assigned, n] -> do
  backend <- initializeBackend host port rtable
  startMaster backend $ \slaves -> do
    let used = filterNodes slaves assigned
    result <- master (read n) used
    liftIO $ print result
```
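The post never shows `filterNodes`, so here is a minimal sketch of one way it could look. It assumes the assigned node list arrives as a comma-separated string of plain hostnames (Slurm may actually hand over a compressed form such as `node[57-60]`, which would need expanding first) and that `show` on a `NodeId` yields a string containing the hostname, as in `nid://node57:7766:0`:

```haskell
import Data.List (isInfixOf)

-- Keep only the slaves whose rendered NodeId mentions one of the
-- hostnames assigned to this job.  The Show constraint stands in for
-- NodeId so the sketch is self-contained.
filterNodes :: Show a => [a] -> String -> [a]
filterNodes slaves assigned = filter belongs slaves
  where
    -- Hypothetical assumption: the node list is comma-separated hostnames.
    hosts = words (map (\c -> if c == ',' then ' ' else c) assigned)
    belongs nid = any (`isInfixOf` show nid) hosts
```

Matching by substring is crude (e.g. `node5` would also match `node57`), but it is enough to keep one job's master from adopting another job's slaves.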
Unresolved problems

After it worked for the first runs, I started to ask for more slaves (ha ha ha). But the jobs began to output this error:
```
addMembership: failed (Unknown error 18446744073709551615)
```

Maybe it is a problem of ports already bound, or a timeout during creation; I don't know, but better error feedback would be great. I also missed a way to start processes without specifying the port, letting the package search for the first free port in a desired range. Also, several times I needed to obtain the server name from a NodeId value. I implemented a coarse function to get it. I think this functionality should be included in the distributed-process package.
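A free-port search of the kind I missed could be sketched like this. This is my own code, not part of any package, and it carries the usual caveat that there is an unavoidable race between probing a port and actually using it:

```haskell
import Control.Exception (IOException, bracket, try)
import Network.Socket

-- Probe each port in [lo..hi] and return the first one we can bind,
-- or Nothing if the whole range is busy.
firstFreePort :: PortNumber -> PortNumber -> IO (Maybe PortNumber)
firstFreePort lo hi = go lo
  where
    go p
      | p > hi    = return Nothing
      | otherwise = do
          ok <- canBind p
          if ok then return (Just p) else go (p + 1)
    -- Open a throwaway socket, try to bind it, and close it again.
    canBind p = bracket (socket AF_INET Stream defaultProtocol) close $ \s -> do
      r <- try (bind s (SockAddrInet p 0)) :: IO (Either IOException ())
      return (either (const False) (const True) r)
```

Something in this spirit, built into the backend, would let `initializeBackend` be called with a port range instead of a fixed port.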
```haskell
import Text.Regex.Posix ((=~))

-- Extract the hostname from the textual form of a NodeId,
-- e.g. "nid://node60:7866:0" -> "node60".
nidNode :: NodeId -> String
nidNode nid =
  case regres of
    [_:nn:_] -> nn
    _        -> "*"
  where
    regres = show nid =~ "nid://(.+):(.+):(.+)" :: [[String]]
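As a sanity check, the same extraction can be exercised on a plain string (the `nid://host:port:n` format is what `show` produces for a `NodeId` in this setup); `hostOf` below is just a hypothetical string-level twin of `nidNode`:

```haskell
import Text.Regex.Posix ((=~))

-- Same regex logic as nidNode, but over a raw string, so it can be
-- checked without a NodeId in scope.
hostOf :: String -> String
hostOf s =
  case (s =~ "nid://(.+):(.+):(.+)" :: [[String]]) of
    [_:nn:_] -> nn
    _        -> "*"
```

With this in hand, `hostOf "nid://node60:7866:0"` gives back `"node60"`, and anything that does not match the pattern falls through to `"*"`.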
Last words

This is an example of an executed job using 64 cores: 63 slaves spawned across four servers, and 1 master:
```
START Master on: node60 7866
 + [node60:7866] 63 slave/s
 + [node60:7866] -> node57 = 16 slaves
 + [node60:7866] -> node58 = 16 slaves
 + [node60:7866] -> node59 = 16 slaves
 + [node60:7866] -> node60 = 15 slaves
1001000
END Master
```

I found Cloud Haskell good enough (although I have doubts about the name: currently it is more distributed than clouded). And it adds a practical alternative to the kind of distribution frameworks usually seen on HPC clusters. The performance should be tested with real problems but, again, for users accustomed to the benefits of Haskell it is a good start.