Integration algorithm not giving sensible or reproducible results

Ian · May 19, 2020, 7:34am

Expected behavior

Sensible integration

Actual behavior

Not sensible integration. Result gives values way beyond realistic range. Repeated running of script below also gives different results.

Steps to reproduce the behavior

    # The following line helps with future compatibility with Python 3
    # print must now be used as a function, e.g print('Hello','World')
    from __future__ import (absolute_import, division, print_function, unicode_literals)
    
    # import mantid algorithms, numpy and matplotlib
    from mantid.simpleapi import *
    
    import matplotlib.pyplot as plt
    
    import numpy as np
    
w = np.linspace(-0.5,0.5,1000)
q = np.linspace(0.2,2,50)
W,Q = np.meshgrid(w, q)
g = 0.15*Q**2
S = g/(W**2+g**2)
rnd = 25*np.random.rand(50000).reshape(50,1000)
fake_data = S+rnd

test_ws = CreateWorkspace(DataX = W, DataY = fake_data, NSpec=50)
fig, axs = plt.subplots(2,1)
axs[0].pcolormesh(W,Q, (S+rnd))

def quickinteg(workspace):
    ''' Takes a reduced workspace and integrates all spectra over an energy range'''
    ws = mtd[workspace]
    el_y, el_e = [],[]
    spec = [i for i in range(ws.getNumberHistograms())]
    for i in spec:
        print('integrating workspace %s' %i)
        elastic_ws = Integration(ws, StartWorkspaceIndex=i, EndWorkspaceIndex=i,
                               RangeLower = -0.0175, RangeUpper = 0.0175)
        elasticy , elastice = elastic_ws.readY(0), elastic_ws.readE(0)
        el_y.append(elasticy); el_e.append(elastice)
    
    all_el_y = np.stack(el_y, axis = -1); all_el_e = np.stack(el_e, axis = -1)
       
    elastic = CreateWorkspace(DataX=spec, DataY=all_el_y, DataE=all_el_e, NSpec = 1, UnitX='MomentumTransfer', YUnitLabel = 'Elastic Intensity')
    RenameWorkspaces([elastic], Prefix = '%s_' %ws.name())
    
quickeisf('test_ws')
int_ws = mtd['test_ws_elastic']
x, y = int_ws.readX(0), int_ws.readY(0)

axs[1].plot(x,y)

plt.show

Platforms affected

Problem seen using Workbench 5.0.0 on mac

DanielMurphy22 · May 19, 2020, 8:48am

Hi,

I’ve run this script, assuming you meant quickinteg(‘test_ws’) on the 5th last line.
I think the reason you are getting a different answer each time is because the input data involves np.random.rand()

Is there something I’m missing that is behaving unexpectedly?

Ian · May 19, 2020, 9:05am

Hi Daniel, I’ve taken out the random input, and I’m still getting a dodgy result.

The integrated intensity in the 2nd plot should follow a smooth curve, but I’m getting some crazy outliers (nan and powers to many hundreds).
I’ve updated the script and tested it - see below. The difference between the median and max values is of concern.

# The following line helps with future compatibility with Python 3
# print must now be used as a function, e.g print('Hello','World')
from __future__ import (absolute_import, division, print_function, unicode_literals)

# import mantid algorithms, numpy and matplotlib
from mantid.simpleapi import *

import matplotlib.pyplot as plt

import numpy as np

w = np.linspace(-0.5,0.5,1000)
q = np.linspace(0.2,2,50)
W,Q = np.meshgrid(w, q)
g = 0.15*Q**2
S = g/(W**2+g**2)
fake_data = S

test_ws = CreateWorkspace(DataX = W, DataY = fake_data, NSpec=50)
fig, axs = plt.subplots(2,1)
axs[0].pcolormesh(W,Q,S)

def quickinteg(workspace):
    ''' Takes a reduced workspace and integrates all spectra over an energy range'''
    ws = mtd[workspace]
    el_y, el_e = [],[]
    spec = [i for i in range(ws.getNumberHistograms())]
    for i in spec:
        print('integrating workspace %s' %i)
        elastic_ws = Integration(ws, StartWorkspaceIndex=i, EndWorkspaceIndex=i,
                               RangeLower = -0.0175, RangeUpper = 0.0175)
        elasticy , elastice = elastic_ws.readY(0), elastic_ws.readE(0)
        el_y.append(elasticy); el_e.append(elastice)
    
    all_el_y = np.stack(el_y, axis = -1); all_el_e = np.stack(el_e, axis = -1)
       
    elastic = CreateWorkspace(DataX=spec, DataY=all_el_y, DataE=all_el_e, NSpec = 1, UnitX='MomentumTransfer', YUnitLabel = 'Elastic Intensity')
    RenameWorkspaces([elastic], Prefix = '%s_' %ws.name())
    
quickinteg('test_ws')
int_ws = mtd['test_ws_elastic']
x, y = int_ws.readX(0), int_ws.readY(0)

print (np.mean(y))
print (np.median(y))
print (np.max(y))

axs[1].plot(x,y)

plt.show

mducle · May 19, 2020, 10:36am

Hi Ian,

the problem is that you’re using the readY and readE methods. The data in the workspaces are stored as C++ arrays and the read* methods give a “view” into these arrays in memory (that is, elasticy and elastice are “pointers” to the memory location of the data). However after each iteration you overwrite the elastic_ws workspace so Mantid sees that the old workspace memory is no longer need and frees it (in C++). However, you still have those “views” into the now freed memory which basically points at random giberish (also known as “dangling pointers”). It’s a bit of a subtle point which is not very well explained in the documentation. The reason Mantid works like this is because to create a copy of the data as a Python variable could become computationally expensive.

There is another method to get the data which does create a copy, called extractY and extractE etc. So if you change the line:

        elasticy , elastice = elastic_ws.readY(0), elastic_ws.readE(0)

to

        elasticy , elastice = elastic_ws.extractY()[0], elastic_ws.extractE()[0]

the slightly different syntax is because extract* extracts the whole data array and not just one spectrum (the reason the read* methods need to read individual spectrum is because that is how the data are arranged in C++, each spectrum are treated as independent 1-D arrays even if they are next to each other in memory).

However, a much simpler syntax is:

test_ws_elastic = Integration('test_ws', -0.0175, 0.0175)
test_ws_elastic = Transpose('test_ws_elastic')
int_ws = mtd['test_ws_elastic']
x, y = int_ws.readX(0), int_ws.readY(0)

In your case this produces an off-by-1 error in the x-axis compared with quickinteg because in quickinteg you used a zero-based python list of the number of histograms whereas using Integration on the whole workspace directly gives the x-axis as the (one-based) spectrum number.

All the best,

Duc.

mducle · May 19, 2020, 10:44am

An alternative to using extract* is to wrap the output in a float - this will create an explicit copy of the variable too - e.g.

        elasticy , elastice = float(elastic_ws.readY(0)), float(elastic_ws.readE(0))

Ian · May 19, 2020, 12:01pm

Thanks Duc,

I’d tried extractX() without success, assuming that since it was a workspace with only one value it would be fine. I didn’t realise it need the square brackets. I’m still learning!

Ian · May 19, 2020, 12:14pm

I learned this habit from the python in Mantid tutorial here. It might be worth adding a note or updating the documentation.

DanielMurphy22 · May 19, 2020, 12:20pm

Thank you very much for pointing this out! I’ll make sure what’s on here is changed.

mducle · May 19, 2020, 6:43pm

I’ve updated that wiki page and also related pages (MatrixWorkspace Attributes - Mantid Project and Python Reading Workspace Data - Mantid Project ). I think these pages are quite old and not used much anymore (by developers anyway) so they have not been updated.

The developers only really look at the wiki pages used in the training course, and the documentation built from the source code using sphinx ( Mantid ).

Daniel, it might be an idea to remove these wiki documentation (or at least prune them) or move them to sphinx.

DanielMurphy22 · May 20, 2020, 8:48am

I’m actually in the process of moving this Python in Mantid Course to the docs and updating it. I’ll make sure to include your edits.