Convert Veeco Wyko NT9100 ASC format
Description
Python code for reading profilometry data from a Veeco Wyko NT9100, preprocessing it, and saving it into another format for further processing.
A picture of the Veeco profilometer.
In the Veeco software, the file should be saved in ASC format.
Options for the ASC format in the Veeco software.
Notes
- There are several options in the Veeco software for saving in the ASC format. I think all of them should work except for the XYZ Triplet, Pixel option.
- The code that matters to the user is contained in a main() function, so that it can sit at the top of the file, with all the internal functions you don't need to care about at the end, while keeping everything in one file.
Usage
- Copy the Python file into the directory where the ASC file is stored.
- Edit the input and output file names in the main() function.
- Run the code.
Example
The following figure shows an example of the process on data from a face of a 2 mm polyhedral particle.
Illustration of the steps the code performs, from left to right: reading, cleaning, filling.
Functions
The data passed between functions is stored in a NumPy array.
extract_Wyko_ASCII_data
Simple usage: the only parameter is the input file name. It outputs the profilometry data, which I call the height field hfld, as well as an intensity field ifld.
The intensity field from the data shown in the example above.
removePeaks
Custom function to clean up the data values, i.e. remove the spurious peaks. I define a peak as a Dirac impulse (1 pixel wide), either positive or negative, that is wildly different from its neighbours.
The function takes the input field (a NumPy array) and outputs the filtered field. An optional diffth variable defines what "wildly different" means: it is a threshold on the difference between the pixel value and its 4-neighbours. If the difference is above this threshold, the pixel is considered a peak.
I had to write a custom (and slow) function because the data can contain pixels with no value (which I'll call undefined, and which are set to NaN in the code). This creates a lot of problems when peaks are next to undefined pixels, making typical filters like a median non-optimal.
The algorithm has three parts:
- Find all 4-connected components of defined pixels (i.e. groups of defined pixels surrounded by undefined ones) of size 1 or 2. Directly remove the ones of size 1 (because there is no way of locally determining whether it is a spurious value or not), and flag those of size 2. By removing, I mean setting the pixel to undefined.
- For all defined pixels, compute the local gradient as the absolute difference between the pixel value and the average of its (defined) neighbours. If it is above diffth AND the pixel is part of one of the flagged size 2 connected components, remove the pixel and its neighbour (because there is no way of determining whether it is that pixel or the neighbour that is the peak). If it is above diffth and the pixel is NOT part of a flagged connected component, add the pixel to an ordered set (ordered by decreasing gradient).
- Iteratively remove the peaks, starting from the highest. For each peak in that ordered set (starting with the one with the largest gradient), check whether it is still a peak, and if so, set the pixel value to the average of its neighbours. This has to be done iteratively because the neighbours of a high peak can also be flagged as peaks, as their local gradient is influenced by the peak. The illustration below shows this in one dimension. With this iterative process, the central pixel (the real peak) is filtered first, so when we next look at its neighbours, their local gradient is no longer above diffth, and they are therefore not touched.
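The first two steps above can be sketched as follows. This is only an illustration under my own assumptions, not the code from the repository: the helper names neighbour_mean, local_gradient and component_sizes are hypothetical, and the connected components are labelled with scipy.ndimage.label, which uses 4-connectivity by default for 2-D arrays.

```python
import numpy as np
from scipy import ndimage


def neighbour_mean(field):
    """Average of the defined 4-neighbours of each pixel (NaN if there are none)."""
    padded = np.pad(field, 1, constant_values=np.nan)
    # Stack the up, down, left and right neighbours of every pixel.
    neigh = np.stack([
        padded[:-2, 1:-1],
        padded[2:, 1:-1],
        padded[1:-1, :-2],
        padded[1:-1, 2:],
    ])
    defined = ~np.isnan(neigh)
    count = defined.sum(axis=0)
    total = np.where(defined, neigh, 0.0).sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)


def local_gradient(field):
    """Absolute difference between each pixel and its neighbour average."""
    return np.abs(field - neighbour_mean(field))


def component_sizes(field):
    """Size of the 4-connected component each defined pixel belongs to (0 if undefined)."""
    labels, _ = ndimage.label(~np.isnan(field))
    sizes = np.bincount(labels.ravel())
    sizes[0] = 0  # label 0 is the undefined background
    return sizes[labels]


nan = np.nan
hfld = np.array([
    [1.0, 1.0, 1.0, nan, nan],
    [1.0, 9.0, 1.0, nan, 5.0],
    [1.0, 1.0, 1.0, nan, nan],
])
sizes = component_sizes(hfld)
grad = local_gradient(hfld)

# Step 1: isolated pixels (components of size 1) are set to undefined.
cleaned = hfld.copy()
cleaned[sizes == 1] = nan
```

In this small example the pixel of value 5 is isolated and gets removed, while the pixel of value 9 has a local gradient of 8 and would be a peak candidate for any diffth below that.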
Illustration of the need for an iterative process to remove the peaks. The left graph shows pixel values in a one-dimensional image, along a line, with a high peak in the centre. The right graph shows the local gradients, while the dotted line shows the threshold diffth, meaning that the neighbours of the peak are also flagged as peaks.
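The one-dimensional situation in the illustration can be reproduced with a few lines. This is a simplified sketch, assuming a line with no undefined pixels and ignoring the two end pixels; remove_peaks_1d is a hypothetical name, not the removePeaks function itself.

```python
import numpy as np


def remove_peaks_1d(values, diffth):
    """Iteratively replace peaks by the average of their two neighbours."""
    out = np.asarray(values, dtype=float).copy()

    def gradient(i):
        # Interior pixels only: difference to the mean of both neighbours.
        return abs(out[i] - 0.5 * (out[i - 1] + out[i + 1]))

    interior = range(1, len(out) - 1)
    # Candidates are flagged on the original data, largest gradient first.
    candidates = sorted(
        (i for i in interior if gradient(i) > diffth),
        key=gradient,
        reverse=True,
    )
    touched = []
    for i in candidates:
        # Re-check: an earlier correction may have cleared this candidate.
        if gradient(i) > diffth:
            out[i] = 0.5 * (out[i - 1] + out[i + 1])
            touched.append(i)
    return out, candidates, touched


line = [1, 1, 1, 10, 1, 1, 1]
cleaned, candidates, touched = remove_peaks_1d(line, diffth=3)
```

Here the pixels at indices 2, 3 and 4 are all flagged initially (gradients of 4.5, 9 and 4.5), but only index 3 ends up being modified: once it is corrected, its neighbours no longer exceed the threshold.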
This method is optimal in the sense that it modifies the minimum number of pixel values (compared to a blunt filter like a median).
fill_novals
Once the peaks are removed, the undefined pixels can be filled. This function takes the field as input (a NumPy array) and outputs a field with no more undefined values. It has two optional arguments: noval, to select the filling method, and cval, the parameter for the first of the four methods implemented:
- constant: Set undefined pixels to a constant value given by cval (default is 0).
- average: Set undefined pixels to the overall average value.
- interpolate: Perform a linear interpolation in the undefined regions, using scipy.interpolate.griddata. Since this method only fills the convex hull of the defined values, all remaining undefined pixels near the border are then replaced with a local average.
- inpaint (the default): Perform an inpainting using the function skimage.restoration.inpaint_biharmonic.
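As a rough sketch of the first three methods (not the actual fill_novals code), the filling could look like the following. The function name fill_sketch is hypothetical, the inpaint method is left out to keep the dependencies to NumPy and SciPy, and the pixels outside the convex hull are filled with the global mean as a simple stand-in for the local average mentioned above.

```python
import numpy as np
from scipy.interpolate import griddata


def fill_sketch(field, noval="interpolate", cval=0.0):
    """Fill NaN pixels with a constant, the global mean, or a linear interpolation."""
    out = np.asarray(field, dtype=float).copy()
    undefined = np.isnan(out)
    if noval == "constant":
        out[undefined] = cval
    elif noval == "average":
        out[undefined] = np.nanmean(out)
    elif noval == "interpolate":
        rows, cols = np.indices(out.shape)
        points = np.column_stack([rows[~undefined], cols[~undefined]])
        targets = np.column_stack([rows[undefined], cols[undefined]])
        out[undefined] = griddata(points, out[~undefined], targets, method="linear")
        # griddata returns NaN outside the convex hull of the defined pixels.
        # Assumption: use the global mean there instead of a local average.
        out[np.isnan(out)] = np.nanmean(out)
    else:
        raise ValueError(f"unknown filling method: {noval}")
    return out


# A horizontal ramp with one undefined interior pixel and one undefined corner.
ramp = np.tile(np.arange(4.0), (4, 1))
ramp[1, 1] = np.nan
ramp[0, 0] = np.nan

filled_constant = fill_sketch(ramp, "constant", cval=-1.0)
filled_average = fill_sketch(ramp, "average")
filled_interp = fill_sketch(ramp, "interpolate")
```

On this ramp, the interior pixel is recovered by the linear interpolation, while the corner pixel lies outside the convex hull of the defined pixels and falls back to the mean.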
Note
If this function is called on a field from which the peaks have not been removed, the last two methods can produce weird results, as the peaks next to undefined values will affect the filling.
Saving functions
These functions take as inputs the field and output filename.
save_PNG
Saves to the Portable Network Graphics format. In this format, the values have to be rescaled to the range [0, 255] and converted to unsigned 8-bit values.
Data saved as PNGs with this function, before (left) and after (right) processing. The black zones in the left image are the undefined pixel regions. Also, the left image looks much darker because of the rescaling: the peaks reach much higher values, so when the value range is brought down to [0, 255], all the useful values are squashed into the darker grey levels.
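The effect of the rescaling can be sketched with NumPy alone. The helper to_uint8 below is a hypothetical illustration, not the save_PNG code; it maps undefined pixels to 0 (black), consistent with the images described above, and it does not write any file.

```python
import numpy as np


def to_uint8(field):
    """Linearly rescale the defined values to [0, 255]; undefined pixels become 0."""
    field = np.asarray(field, dtype=float)
    lo, hi = np.nanmin(field), np.nanmax(field)
    # A constant field has no range to stretch, so it maps to 0 everywhere.
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    scaled = np.rint((field - lo) * scale)
    return np.nan_to_num(scaled, nan=0.0).astype(np.uint8)


with_peak = to_uint8([0.0, 1.0, 2.0, 100.0])  # one spurious peak at 100
cleaned = to_uint8([0.0, 1.0, 2.0, 3.0])      # same data once the peak is gone
with_hole = to_uint8([np.nan, 0.0, 3.0])      # an undefined pixel
```

With the peak present, the three useful values end up at grey levels 0, 3 and 5; once the peak is removed, they spread over 0, 85 and 170.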
save_Avizo_ASCII
Saves to an Avizo ASCII format. The advantage of this format is that no conversion is needed: the real values are preserved.
This function can be generalised to any text file format, if you give it the header and delimiter as parameters.
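A generic version along those lines could be built on numpy.savetxt, as in the sketch below. The function name save_text and the header string are placeholders of mine; in particular, the header shown is not the Avizo header.

```python
import io

import numpy as np


def save_text(field, target, header="", delimiter=" ", fmt="%.6g"):
    """Write a 2-D field as text, preceded by an optional header."""
    # comments="" stops NumPy from prefixing the header lines with "# ".
    np.savetxt(target, field, fmt=fmt, delimiter=delimiter,
               header=header, comments="")


# Writing to an in-memory buffer here; a file name works the same way.
buffer = io.StringIO()
save_text(np.array([[1.5, 2.0], [3.25, 4.0]]), buffer,
          header="placeholder header", delimiter=",")
text = buffer.getvalue()
```

The values are written as text without any rescaling, so the only loss is the precision set by the fmt argument.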
Acknowledgements
Thanks to Laura Manceriu for the acquisitions and for providing the data.