[GAP Forum] Multitasking with GAP
Alan Hylton
agh314 at lehigh.edu
Mon May 24 01:31:02 BST 2021
Howdy,
Thanks for the help!
Right now, I am generating lists of Lie algebras and I wish to calculate
the dimensions of H^2(G, G) and H^3(G, G) for each G. Since these are all
independent and time-consuming, I wanted to split them up. For now, the
return would just be a two-element list: [ dimension of H^2, dimension of
H^3 ].
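
For reference, the splitting step can be sketched as a small helper. This is
only an illustration: `SplitIntoRanges` is a hypothetical name that does not
come from GAP or any package, and the logic simply mirrors the loop in the
original message quoted below (the first `n mod cores` ranges get one extra
element).

```gap
# Hypothetical helper (not part of GAP): split the indices 1..n into at most
# `cores` contiguous ranges [start, stop] whose lengths differ by at most one.
# Assumes n >= 0 and cores >= 1.
SplitIntoRanges := function(n, cores)
    local ranges, start, len, i;
    ranges := [];
    start := 1;
    # Never create more ranges than there are items.
    for i in [1 .. Minimum(n, cores)] do
        len := QuoInt(n, cores);
        # The first (n mod cores) ranges absorb the remainder.
        if i <= n mod cores then
            len := len + 1;
        fi;
        Add(ranges, [start, start + len - 1]);
        start := start + len;
    od;
    return ranges;
end;
```

For example, `SplitIntoRanges(10, 3)` gives `[ [ 1, 4 ], [ 5, 7 ], [ 8, 10 ] ]`.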
Because this is not IO intensive, I have elected to use Christopher
Jefferson's suggested solution, ParListByFork.
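
A minimal sketch of that approach, under some assumptions: `ParListByFork`
comes from the IO package, and the call below uses the form
`ParListByFork(list, worker, opt)` with the `NumberJobs` component set, which
is my reading of the IO manual and should be checked against the installed
version. The `worker` function and the `inputs` list are placeholders; the
real worker would take a Lie algebra and return its two dimensions.

```gap
LoadPackage("io");

# Placeholder worker: stands in for the real, time-consuming computation.
# It returns a two-element list, mirroring [ dim H^2, dim H^3 ].
worker := function(x)
    return [ x^2, x^3 ];
end;

# Placeholder input; in practice this would be the list of Lie algebras.
inputs := [ 1 .. 8 ];

# Apply worker to every element, using up to 4 forked GAP processes.
# Assumption: the options record needs the NumberJobs component.
results := ParListByFork(inputs, worker, rec(NumberJobs := 4));
```

Each forked process works on its own copy of memory, so an assignment to a
global list inside `worker` is not visible in the parent process. The results
have to come back as return values, which is what the returned list provides.
Forking also means this only works on Unix-like systems.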
I am still reading through the documentation for SCSCP, and think I will
try to write as many Bash scripts as possible to automate the creation and
farming of tasks. It seems like the way to go is to have a pair of GAP
files - one for the servers that creates all the necessary structures, and
one "main" that is aware of the lists and orderings to make each server
operate on a particular sublist.
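
A rough sketch of what such a pair of files might look like, assuming the
SCSCP package functions `InstallSCSCPprocedure`, `RunSCSCPserver` and
`EvaluateBySCSCP` behave as described in the SCSCP manual. The procedure name
`CohomologyDims`, its placeholder body, and the choice of host and port are
all hypothetical and would need adapting.

```gap
# Server file (one GAP process per server): define the procedure and serve it.
LoadPackage("scscp");

# Placeholder for the real computation; returns a two-element list.
CohomologyDims := function(x)
    return [ x^2, x^3 ];
end;

# Make the function callable by remote clients under this name.
InstallSCSCPprocedure("CohomologyDims", CohomologyDims);

# Start listening; this call does not return while the server is running.
RunSCSCPserver("localhost", 26133);
```

```gap
# Main file: send one argument list to a running server and read the answer.
LoadPackage("scscp");

# Assumes a server started from the file above is listening on this port.
res := EvaluateBySCSCP("CohomologyDims", [ 3 ], "localhost", 26133);

# The computed value is expected in the `object` component of the result.
dims := res.object;
```

With several servers, each would be started on its own port, and the main
process would decide which sublist goes to which server.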
Again, I really appreciate it. I am impressed and humbled by how responsive
and helpful the GAP community is.
Best,
Alan
On Tue, May 18, 2021 at 4:23 AM Alexander Konovalov <
alexander.konovalov at st-andrews.ac.uk> wrote:
> Hi Alan and Sergio,
>
> In addition to the SCSCP manual, I have
> https://github.com/alex-konovalov/scscp-demo
> which I was using for the advanced part of the GAP Software
> Carpentry-style lesson
> (https://carpentries-incubator.github.io/gap-lesson/).
>
> You may find it useful to set up things, following its README. In
> particular, you can have a script to set up a "farm" of GAP SCSCP servers
> in one go. From my experience, for a good load balancing, to run it on N
> cores, you can start N servers and one main GAP process from which you
> will be calling those servers, and the performance will be better than
> (N-1) servers and one main. If you balance the load well, then the main
> process will be mostly idle, so you can utilise the N-th core better. But
> that may depend on the task granularity, and on the data that you pass
> between nodes. A case study explaining optimising this in one particular
> case can be found here:
>
> A. Konovalov and S. Linton. 2010. Parallel computations in modular group
> algebras. In Proceedings of the 4th International Workshop on Parallel and
> Symbolic Computation (PASCO '10). Association for Computing Machinery,
> New York, NY, USA, 141–149. DOI:https://doi.org/10.1145/1837210.1837231
>
> In your case
>
> >> master_list := [ [...], [...], ..., [...]];
> >> n := Length(master_list);
> >> cores:=15;
>
>
> what are the objects that will be arguments of remote procedure calls,
> and what are the results returned back?
>
> Best wishes
> Alexander
>
>
>
>
>
> > On 18 May 2021, at 07:26, Sergio Siccha <sergio at mathb.rwth-aachen.de> wrote:
> >
> > Hi Alan,
> >
> > unfortunately `RunTask` in GAP is more or less only a wrapper for
> > `CallFuncListWrap` and is indeed blocking. `RunTask` in GAP is only a
> > mock-up of the proper `RunTask` function from HPC-GAP. HPC-GAP didn't
> > make it out of alpha stage though and if I'm not mistaken nobody is
> > working on it ATM.
> >
> > I'm not sure whether that's the best option as I have never used it, but
> > in principle you should be able to do parallel computations by spawning
> > several GAP processes via the SCSCP package, see chapter 8 "Parallel
> > computing with SCSCP" of the SCSCP manual. So I guess as long as your
> > computations don't need a lot of memory that should be fine.
> >
> > Hope this helps! :-)
> >
> > Best,
> > Sergio
> >
> >
> > On 18.05.21 07:35, Alan Hylton wrote:
> >> Howdy,
> >>
> >> Suppose I have a list of lists, and I wish to run some time-consuming
> >> process on each of these sub-lists (each sub-list is independent, so I am
> >> not worried about race conditions).
> >>
> >> I think the easiest way to demonstrate my thought process is with code:
> >>
> >> I have a list of lists and the number of cores I wish to use -
> >> master_list := [ [...], [...], ..., [...]];
> >> n := Length(master_list);
> >> cores:=15;
> >>
> >> I have a time consuming function whose arguments are ranges into
> >> master_list -
> >> time_consuming_func := function(start_index, stop_index)
> >>     ...
> >> end;
> >>
> >> I portion out master_list, creating a list of tasks -
> >> task_list:=[];
> >> start:=1;
> >> for i in [1..cores] do
> >>     flag:=0;
> >>     if i <= n mod cores then
> >>         flag:=1;
> >>     fi;
> >>     if i > n then
> >>         break;
> >>     fi;
> >>     len:=Int(n/cores)+flag;
> >>
> >>     Add(task_list, RunTask( time_consuming_func , start, start+len-1));
> >>
> >>     start:=start+len;
> >> od;
> >>
> >> I had several hopes:
> >> 1: I could get a list of tasks, and then use something like TaskFinished
> >> to see if each is done
> >> 2: Store the result of each time_consuming_func in the global master_list
> >>
> >> But I ran into one problem: RunTask seems to be blocking. Instead of
> >> spawning a process and continuing with my loop, it waits until each task
> >> finishes. I considered DelayTask instead of RunTask so that I could just
> >> use WaitTask, but DelayTask does not seem to exist (similarly for
> >> asynchronous tasks). Is there an alternative? To follow the documentation,
> >> I'd like to avoid the lower-level CreateThread if I can.
> >>
> >> Also, number 2 makes some assumptions on how memory works. Is it actually
> >> valid to have a thread working on element i of master_list modify the
> >> global master_list[i]?
> >>
> >> I'd greatly appreciate any insight!
> >>
> >> Thanks,
> >> Alan
> >> _______________________________________________
> >> Forum mailing list
> >> Forum at gap-system.org
> >> https://mail.gap-system.org/mailman/listinfo/forum
> >>
> >