Commit df65f633 authored by Plougonven Erwan
Update README.md

parent 92e515bb
......@@ -4,6 +4,14 @@
Python code for reading profilometry data from a Veeco Wyko NT9100, preprocessing it, and saving it into another format for further processing. In the Veeco software, the file should be saved in ASC format.
![Options for the ASC format in the Veeco software.](assets/Veeko_ASC_save_opts.png)
### Notes
- [ ] There are several options in the Veeco software for saving in the ASC format. I think all should work except for the `XYZ Triplet, Pixel` option.
- [ ] The code useful for the user is contained in a `main()` so I can put that at the top, and all the internal stuff you don't care for at the end, and keep everything in one file.
## Usage
- [ ] Copy the Python file into the directory where the ASC file is stored.
......@@ -11,22 +19,65 @@ Python code for reading profilometry data from a Veeco Wyko NT9100, preprocessi
- [ ] Run the code.
## Example
Illustration of the steps the code performs, from left to right: reading, cleaning, filling.
![Illustration of the three steps, reading, cleaning, filling.](assets/Example.png)
## Functions
The data passed between functions is stored in a [NumPy array](https://numpy.org/doc/stable/reference/arrays.html).
### `extract_Wyko_ASCII_data`
Simple usage: the only parameter is the input file name. It outputs the profilometry data, what I call the height field `hfld`, as well as an intensity field `ifld`. Here's an illustration of the intensity field from the data shown in the example above:
![Illustration of an intensity field.](assets/Example_intensity.png)
### `removePeaks`
Custom function to clean up the data values, i.e. remove the spurious peaks. I define a peak as a Dirac impulse (1 pixel wide), either positive or negative, *wildly different* from its neighbours.
The function takes in the input field (a `NumPy` array), outputs the filtered field, and has an optional `diffth` variable that defines what *wildly different* means. It is a threshold on the absolute difference between the pixel value and the average of its [4-neighbours](wikipedia.org/wiki/Pixel_connectivity#2-dimensional): if the difference is above this threshold, the pixel is considered a peak.
I had to write a custom (and slow) function because the data can contain pixels with no value (which I'll call undefined, and which are set to `NaN` in the code). That creates a lot of problems when peaks are next to undefined pixels, making typical filters like a median non-optimal.
The algorithm has three parts:
1. Find all [4-connected components](http://dx.doi.org/10.1016/0734-189X(89)90147-3) of defined pixels (i.e. groups of defined pixels surrounded by undefined ones) of size 1 or 2, directly remove the ones of size 1 (because there is no way of locally determining whether it's a spurious value or not), and flag those of size 2. By *removing*, I mean setting the pixel to undefined.
2. For all defined pixels, compute the local *gradient* as the absolute difference between its value and the average of its (defined) neighbours. If above `diffth` AND part of one of the flagged size 2 connected components, then remove it and its neighbour (because there's no way of determining whether it's that pixel or the neighbour that's the peak). If above `diffth` and is NOT part of the flagged connected components, then add that pixel to an ordered set (ordered by decreasing *gradient*).
3. Iteratively remove the peaks, starting from the highest. For each peak in that ordered set (starting with the one with the largest gradient), check whether it is still a peak and, if so, set the pixel value to the average of its neighbours. This has to be done iteratively because the neighbours of a high peak can also be flagged as peaks, as their local gradient is influenced by the peak. The illustration below shows this in one dimension. With this iterative process, the central pixel (the real peak) is filtered first, so when we next look at the neighbours, their local gradient is no longer above `diffth`, and they are therefore not touched.
![Illustration of the need for an iterative process to remove the peaks. The left graph shows pixel values in a one-dimensional image, along a line, with a high peak in the centre. The right graph shows the local gradients, while the dotted line shows the threshold `diffth`, meaning that the neighbours of the peak are also flagged as peaks.](assets/diffth_example.png)
This method is optimal in the sense that it modifies the minimum number of pixel values (compared to a blunt filter like a median).
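
To make steps 2 and 3 more concrete, here is a minimal sketch of the idea, not the code from this repository. The names `neighbour_mean`, `local_gradient` and `remove_peaks_sketch` are hypothetical, and the sketch leaves out step 1 and the special handling of size 2 connected components. It only shows the `NaN`-aware local gradient and the ordered, iterative replacement.

```python
import numpy as np

OFFSETS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-neighbourhood


def neighbour_mean(field, i, j):
    """Average of the defined (non-NaN) 4-neighbours of pixel (i, j)."""
    rows, cols = field.shape
    vals = [
        field[i + di, j + dj]
        for di, dj in OFFSETS
        if 0 <= i + di < rows
        and 0 <= j + dj < cols
        and not np.isnan(field[i + di, j + dj])
    ]
    return sum(vals) / len(vals) if vals else np.nan


def local_gradient(field, i, j):
    """Absolute difference between a pixel and the mean of its neighbours."""
    return abs(field[i, j] - neighbour_mean(field, i, j))


def remove_peaks_sketch(field, diffth):
    """Hypothetical, simplified version of steps 2 and 3."""
    out = np.array(field, dtype=float)  # work on a copy
    # Step 2 (simplified): collect every defined pixel above the threshold.
    candidates = []
    for i, j in np.argwhere(~np.isnan(out)):
        g = local_gradient(out, i, j)
        if g > diffth:  # False when g is NaN (no defined neighbour)
            candidates.append((g, i, j))
    # Order the candidates by decreasing gradient.
    candidates.sort(key=lambda c: c[0], reverse=True)
    # Step 3: re-check each candidate against the current state of the field.
    for _, i, j in candidates:
        if local_gradient(out, i, j) > diffth:
            out[i, j] = neighbour_mean(out, i, j)
    return out


# A flat field with one high peak next to an undefined pixel.
field = np.zeros((3, 5))
field[1, 2] = 100.0
field[1, 3] = np.nan
cleaned = remove_peaks_sketch(field, diffth=10.0)
```

In this toy field the neighbours of the central pixel are initially flagged too, but once the centre has been replaced their gradient falls below `diffth`, so only the centre is modified and the undefined pixel stays undefined.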
### `fill_novals`
Once the peaks are removed, the undefined pixels can be filled. This function takes the field as input (a `NumPy` array) and outputs a field with no more undefined values. It has two optional arguments, `noval` to select the filling method, and `cval` the parameter for the first of the four methods implemented:
1. `constant`
Set undefined pixels to a constant value given by `cval`, default is 0.
2. `average`
Set undefined pixels to the overall average value.
3. `interpolate`
Perform a linear interpolation in the undefined regions, using the method [scipy.interpolate.griddata](docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.griddata.html). Since this method only fills the convex hull of the defined values, a subsequent step replaces all remaining undefined pixels near the border with a local average.
4. `inpaint` (the default)
Perform an [inpainting](en.wikipedia.org/wiki/Inpainting) using the function [skimage.restoration.inpaint_biharmonic](scikit-image.org/docs/stable/api/skimage.restoration.html#skimage.restoration.inpaint_biharmonic).
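
As an illustration of the two simplest methods, `constant` and `average`, here is a minimal NumPy sketch. It is not the repository's `fill_novals`: the helper names `fill_constant` and `fill_average` are hypothetical, and it assumes undefined pixels are stored as `NaN`, as described above.

```python
import numpy as np


def fill_constant(field, cval=0.0):
    """Replace undefined (NaN) pixels with a constant value."""
    out = np.array(field, dtype=float)  # work on a copy
    out[np.isnan(out)] = cval
    return out


def fill_average(field):
    """Replace undefined (NaN) pixels with the mean of the defined ones."""
    out = np.array(field, dtype=float)
    out[np.isnan(out)] = np.nanmean(out)
    return out


field = np.array([[1.0, np.nan], [3.0, np.nan]])
filled_constant = fill_constant(field)  # undefined pixels become 0
filled_average = fill_average(field)    # undefined pixels become 2, the mean of 1 and 3
```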
#### Note
If this function is called with a field in which the peaks have not been removed, it can produce weird results with the last two methods, as the peaks next to undefined values will affect the filling.
### Saving functions
These functions take as inputs the field and the output filename.
#### `save_PNG`
Saves to Portable Network Graphics format. In this format, the values have to be rescaled to the range `[0;255]` and converted to unsigned 8-bit values.
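
The rescaling could look like the sketch below. This is an assumption: the text above does not say how `save_PNG` maps values to `[0;255]`, so a linear min-max rescaling is used here, and the helper name `to_uint8` is hypothetical. It expects a field with no undefined values left.

```python
import numpy as np


def to_uint8(field):
    """Linearly rescale a fully defined field to unsigned 8-bit values."""
    field = np.asarray(field, dtype=float)
    lo, hi = field.min(), field.max()
    if hi == lo:
        # A flat field has no range to stretch; map it to 0.
        return np.zeros(field.shape, dtype=np.uint8)
    scaled = (field - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)


image = to_uint8(np.array([[0.0, 20.0], [40.0, 100.0]]))
```

Note that this conversion loses the real height values, which is why the text format described next can be preferable for further processing.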
#### `save_Avizo_ASCII`
Saves to an [Avizo](www.thermofisher.com/us/en/home/electron-microscopy/products/software-em-3d-vis/avizo-software.html) ASCII format. The advantage of this format is that no conversion is needed: the real values are preserved.
This function can be generalised to any text file format, if you give it the header and delimiter as parameters.
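
A generalised version might look like the following sketch, built on `numpy.savetxt`. The function name `save_text` and the header string in the example are hypothetical, and the header shown is not the Avizo header.

```python
import numpy as np


def save_text(field, filename, header="", delimiter=" "):
    """Write a 2D field to a text file with a caller-supplied header."""
    # comments="" writes the header verbatim instead of prefixing it with "# ".
    np.savetxt(filename, field, header=header, delimiter=delimiter, comments="")
```

For example, `save_text(hfld, "hfld.csv", header="height field", delimiter=",")` would write one header line followed by one comma-separated line per row of `hfld`.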
## Acknowledgements
Thanks to [Laura Manceriu](https://my.uliege.be/tr/view2.do?as_codULg=U224648) for the acquisitions and for providing the data.