If I have two Python algorithms, one of which needs to call the other, what’s the best and neatest way to define them?
(There’s a good reason for splitting up - the parent does a lot of work and creates an intermediate result which the child then processes - or perhaps a parent “workflow” algorithm calls the slow first stage and then the post processor - and I want the History of the final workspace to be set so that the final processing can be re-run on its own, defaulting to the same settings as before)
If both are defined in the same file, calling the child in the normal way fails, because the name of the child algorithm refers to the “class”.
I could define the child in one file and have it loaded by default on start-up, and then define the parent in another file to be run from the Script Window - that’s not too convenient.
If I define both in separate files and run them on start-up, how do I know what order they will run in?
If I run them separately in the Script Window, the algorithm definitions in one window aren’t seen in the other, a long-standing “feature”.
Or do I have to do a messy solution with createChildAlgorithm(), remembering to set enableHistoryForChild() ?
Hi James,
The best way is probably to have the algorithms in separate files, both run at start up. The order they are run in doesn’t matter - if the parent is loaded before the child, it will import a fake child that will be replaced when the real child algorithm is loaded.
This can still result in a clash between the algorithm and class names, though. To get around this, in your parent algorithm you can `import mantid.simpleapi as s_api` and then call the child algorithm as `s_api.ChildAlgorithm(...)`.
I see. So after initialisation is complete, or perhaps as initialisation proceeds, mantid.simpleapi is updated to include all the names of the user defined algorithms, and it’s then imported by default in the Python Window panes.
But it doesn’t work for a script run interactively. My script is likely to need some debugging and maybe extending as it’s used, and having to restart Mantid each time, implying either throwing away intermediate results or saving and reloading them, will be a pain. But if I’ve run the script once on start up, re-running in the Python Window usually overwrites the algorithms so that’s OK.
Meanwhile the attached script implements the “workflow” type of operation using `createChildAlgorithm()`: the individual stages record their history, but no overall history is saved. It seems to be necessary to pass the workspace/table names around as strings at the top level. AlgorithmFamilyTree.py (1.9 KB)
Each time you re-run the script in the script window, the algorithm will be re-registered, so you shouldn’t need to restart Mantid. As long as the scripts were loaded at startup, the names will already be there and the algorithm will be replaced when the script is re-run (the `AlgorithmFactory.subscribe(...)` line).
To have all the algorithms in one file, the approach you have used looks like the way to go. At the moment, though, the output workspaces are being produced by the child algorithms, so won’t show the parent algorithm in their history. To change this, set them as the output properties of the parent - I’ve attached an example as AlgorithmFamilyTree-2.py (2.3 KB).
That will just show the parent in the history, though. To make it “unrollable” so that the child algorithms are visible in the history too, the parent should be a “workflow algorithm” (inherit from `DataProcessorAlgorithm` instead of `PythonAlgorithm`).
Hi Tom,
Your script works as described, with only the parent in the history of the intermediate and final workspaces. If I change the definition to `class ParentAlg(DataProcessorAlgorithm):` it behaves the same, with still no acknowledgement of the children. And if I then try to help it out by changing `fch.setChild(True)` to `False` in both places, or by commenting those lines out, I get an error from `self.setProperty("Intermediate", inter)` two lines later:
Error in execution of algorithm ParentAlg:
When converting parameter “Intermediate”: Unable to extract boost::shared_ptr from Python object
at line 16 in ‘C:/MantidInstall/UserAlgorithms/AlgorithmFamilyTree-2.py’
In the documentation for DataProcessorAlgorithm I found a method “enableHistoryRecordingForChild” and tried that - again no child history appeared.
Hi James,
I’ve had another look at this. When the parent algorithm is a `DataProcessorAlgorithm`, you can create the child algorithms with `ch = self.createChildAlgorithm("ChildAlg")`, and the children appear in the history. (In this case there is no need to use `ch.setChild(True)`.) When I run this version, AlgorithmFamilyTree-3.py (2.3 KB), I can see the children in the history.
Nearly there. There’s still a problem with the names of the workspaces when unrolled.
Rolled-up parent history:
ParentAlg(N='12', P=0.29999999999999999, Intermediate='rawdat8', FinalTable='cooked8')
Unrolled to show child history, which won’t now work if run as a script:
`# Child algorithms of ParentAlg
FirstChildAlg(Number='12', OutputTab='__inter')
ChildAlg(InputTab='__TMP00000000245AC0A0', Power=0.29999999999999999, OutputTab='__final')
# End of child algorithms of ParentAlg`
I can fix the names of the output workspaces with for example:
ch.setPropertyValue("OutputTab", self.getProperty("FinalTable").valueAsStr)
but can’t see a similar way to get the input workspace of the second algorithm named properly, or even consistent with the name given to it as output from the first algorithm.
Hi James,
For a workspace to have a name, it has to be in the ADS (visible in the “Workspaces” pane). The intermediate workspace is an output property of `ParentAlg`, and only gets put in the ADS when `ParentAlg` finishes, even if you set the output property (`self.setProperty("Intermediate", inter)`) sooner.
That means when the second `ChildAlg` gets called, its input workspace doesn’t have a name yet, which is why you see the temporary name `__TMP0000...`.
To get around this, the only way I can see is to add the workspace to the ADS manually before `ChildAlg` is called, so that the workspace has a name in the history. I’ve tried this in AlgorithmFamilyTree-4.py (2.6 KB) and the history looks OK now.
Thanks, looks to be working properly at last!
One related question that doesn’t seem to be documented well: if I have a Workflow algorithm, how do I get the “Progress” to work properly? The child algorithms report progress from 0 to 100% if run on their own. If I do nothing special within the parent, the child progress isn’t displayed.
I could for example guess child 1 takes 75% of the total time and child 2 takes 25%, maybe after inspecting the parameters and input workspaces. I’d like the progress bar of the parent (or the one visible, anyway) to run from 0 to 75 while child 1 is running, then 75 to 100 while child 2 runs. Even just having the bar run from 0 to 100 twice would be helpful.
James.
Hi James,
There are two arguments to `createChildAlgorithm` called `startProgress` and `endProgress`. They should be set in the parent algorithm to values between 0 and 1.
For example, in your case child 1 would be created with `startProgress=0.0` and `endProgress=0.75`, and child 2 would have `startProgress=0.75` and `endProgress=1.0`.
Each child algorithm reports its own progress from 0 to 100% as normal, and the parent handles scaling these reported values to 75% / 25%.
(As for how to determine the 75/25 ratio, I think this has to be a guess!)
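The 75/25 arrangement described above could be wired up roughly as follows. This is only a minimal sketch: `run_child` is a hypothetical helper, not part of Mantid, and `parent` is assumed to be the running workflow algorithm (the `self` inside its `PyExec`). The only Mantid calls it relies on are `createChildAlgorithm` with the `startProgress` and `endProgress` arguments, `setProperty` and `execute`.

```python
def run_child(parent, name, start, end, **properties):
    """Create a child algorithm of `parent`, set its properties and run it.

    `start` and `end` are the fractions (between 0 and 1) of the parent's
    progress bar that this child should occupy.
    `run_child` is a hypothetical helper, not a Mantid function.
    """
    child = parent.createChildAlgorithm(name, startProgress=start, endProgress=end)
    for prop_name, value in properties.items():
        child.setProperty(prop_name, value)
    child.execute()
    return child
```

Inside the parent, the first stage might then be started with `run_child(self, "FirstChildAlg", 0.0, 0.75, Number="12", OutputTab="__inter")`, and the second stage given the range 0.75 to 1.0 in the same way.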
Thanks, works well, and it even shows the optional messages from `Prog.report("Doing stuff")`, so the split between the two stages is visible to the users.
I think the split is more like 95:5 in reality. I can do some test runs and calibrate it. But I may make the final stage occupy no less than 5% of the progress bar so it’s still obviously there.
James.
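One way the "no less than 5%" idea could be expressed is sketched below. It is plain arithmetic with no Mantid calls; `two_stage_split` and its argument names are made up for illustration, and the cost estimates are whatever the parent can guess from its parameters and input workspaces.

```python
def two_stage_split(first_cost, second_cost, min_fraction=0.05):
    """Return ((start1, end1), (start2, end2)) progress ranges for two stages.

    The boundary is placed in proportion to the estimated costs, then clamped
    so that neither stage gets less than `min_fraction` of the progress bar.
    """
    if first_cost < 0 or second_cost < 0 or first_cost + second_cost <= 0:
        raise ValueError("costs must be non-negative and not both zero")
    if not 0.0 <= min_fraction <= 0.5:
        raise ValueError("min_fraction must be between 0 and 0.5")
    boundary = first_cost / float(first_cost + second_cost)
    # Keep at least min_fraction of the bar for each stage.
    boundary = max(min_fraction, min(boundary, 1.0 - min_fraction))
    return (0.0, boundary), (boundary, 1.0)
```

The two ranges it returns are the `startProgress` and `endProgress` values for child 1 and child 2. An estimate of 99:1, for example, is clamped so that the final stage still gets the last 5% of the bar.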
Third related question. One of my child algorithms is doing many routine operations with standard Mantid algorithms like GroupWorkspaces, Fit, Plus, etc. As a result lots of log messages are produced just saying
Scale started (child)
Scale successful, Duration 0.00 seconds
Plus started (child)
Plus successful, Duration 0.00 seconds
and any important log messages I wanted to write are lost in the noise.
How do I shut these up? I’m calling these built-in algorithms as simple Python function calls rather than assembling the arguments one by one with self.createChildAlgorithm(), as it’s far easier that way.
As an aside my child algorithms run from the parent Workflow algorithm via self.createChildAlgorithm() DON’T get their own “child1 started” and “child1 successful, Duration 32.57 seconds” log messages. Those might be useful.
Hi James,
To turn off logging for an algorithm called as a simple Python function call, pass `EnableLogging=False` as an extra argument to the function, e.g. `ws = Scale(input, Factor=2, Operation='Multiply', EnableLogging=False)`.
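If many calls need silencing, a small wrapper may save some typing. The sketch below is an assumption about how one might organise this, not a Mantid feature: `run_quietly` is a hypothetical helper that simply adds the `EnableLogging=False` keyword described above unless the caller has set it explicitly.

```python
def run_quietly(algorithm_function, *args, **kwargs):
    """Call a simple-API algorithm function with start/end logging turned off.

    `run_quietly` is a hypothetical helper. It only adds EnableLogging=False
    and leaves an explicit EnableLogging value from the caller untouched.
    """
    kwargs.setdefault("EnableLogging", False)
    return algorithm_function(*args, **kwargs)
```

Within a Mantid script this would be used as `ws = run_quietly(Scale, input, Factor=2, Operation='Multiply')`.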
A child algorithm created the long way round (via `self.createChildAlgorithm`) does not log start and end messages, unless you tell it to store workspaces it creates in the ADS like this:
alg.setAlwaysStoreInADS(True)
A child algorithm run as a simple Python function call has this property set automatically, which is why it logs its start, end and duration.
Setting this option on your child algorithms created via `self.createChildAlgorithm` will turn on the logging. Changing this option means that you don’t need to manually add the intermediate workspace to the ADS. Instead, the output properties of the parent algorithm can be set to the workspace names (`setPropertyValue` rather than `setProperty`).
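As a rough illustration of that last point, creating the child and switching on ADS storage could be bundled together. `create_stored_child` is a hypothetical helper, and `parent` is assumed to be the workflow algorithm instance; the only Mantid calls used are `createChildAlgorithm` and `setAlwaysStoreInADS`.

```python
def create_stored_child(parent, name, **create_kwargs):
    """Create a child algorithm whose output workspaces go into the ADS.

    Extra keyword arguments (for example startProgress and endProgress) are
    passed straight to createChildAlgorithm.
    `create_stored_child` is a hypothetical helper, not a Mantid function.
    """
    child = parent.createChildAlgorithm(name, **create_kwargs)
    # Storing outputs in the ADS gives them names and turns on start/end logging.
    child.setAlwaysStoreInADS(True)
    return child
```

The child's output property can then be given a workspace name with `setPropertyValue`, for example the name taken from the parent via `self.getProperty("FinalTable").valueAsStr`.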
I don’t think this is documented anywhere, so will look into getting this point (and some of the points above, like progress bars) added to the documentation in future!