v2.2.0 - Adjoint flow and cleaning of tbox
Waves v2.2.0 (flow v1.8.0) - Adjoint solver
This release adds a first version of a validated adjoint solver for flow. It enables the computation of gradients with respect to lots of design variables (such as the mesh coordinates), for a low cost. Additionally, the way of building element matrices for the different modules have been reviewed, and several classes have been refactored or cleaned accordingly.
Changes
- Add adjoint flow solver for gradients computation
- Clean Element/Mem classes (#49)
- Refactor functions
- Minor changes and bugfixing (thx @R.Boman)
Notes
Element/Mem
- All the
build()
methods that were specific to a given physics have been removed from theElement
class. These methods have been move to their respective module in...Term
,...Residual
and...Functional
classes. Additionally, the methods are now static. -
Element
now only stores geometric and basic computational information. Accordingly, all the data previously stored in theMem
class gave been gathered intoElement
, andMem
has been removed. The data (Jacobian, normal vector, etc.) must be initialized before any computation can be carried out. This action is performed using theinitValues()
method. Additionally, the order of integration for the Gaussian quadrature can be set using this method. Since the gradients of some data (Jacobian, etc.) is required for some computations, they can be initialized using a separate methodinitGradients()
. Once all the data have been initialized, the can be updated as before using theupdate()
method.
Functions
- All the functions used in flow (
F0El
, etc.) are now specific to this module. Additionally, they derive from the newObserver
class and are not wrapped in python anymore. - An
Observer
class has been added to fwk. It provides a mechanism for self-updating, based on changes occurring in a given class. - The functions of tbox (
Fct0
, etc.) are now smart pointers (40171e0c (comment 5682) has been fixed). Note that these functions are only used in heat and could be moved outside of tbox.
Tests
Tests are passing on ubuntu20.04 (python 3.8.2), and msys2 (python 3.8.1).
Merge request reports
Activity
This commit adds smart pointer support for the function classes in tbox (
Fct...
), heat (CompiledFct...
) and flow (F...
).At this point the code is:
- compiling and running in Release config
- compiling and running in RelWithDebugInfo config, except for flow tests
- compiling but failing at runtime in Debug config, for all tests
I could not identify the cause of the failure at runtime in debug, but this happens since v1.7.1 at least. So, this commit is not responsible for that failure and I will not talk about it.
Regarding the failure at runtime for flow tests, it is triggered by these lines:
initial = Initial(msh, "group", F0PsPhiInf(...)) dirichlet = Dirichlet(msh, "group", F0PsPhiInf(...))
where
Initial
andDirichlet
both derive fromAssign
, andF0PsPhiInf
derives fromF0Ps
. The constructor ofAssign
isAssign(shared_ptr<MshData>, string, shared_ptr<F0Ps>)
.
More specifically, the cause of the failure is an assertion triggered in a SWIG generated function (that explains why it does not fail in release):int SWIG_Python_ConvertPtrAndOwn(PyObject*, void**, swig_type_info*, int, int*) { // ... if (own) { int newmemory = 0; *ptr = SWIG_TypeCast(tc,vptr,&newmemory); if (newmemory == SWIG_CAST_NEW_MEMORY) { assert(own); /* badly formed typemap which will lead to a memory leak - it must set and use own to delete *ptr */ if (own) *own = *own | SWIG_CAST_NEW_MEMORY; } } // ... }
where
int *own
is the last parameter passed toSWIG_Python_ConvertPtrAndOwn
.
After a few tests, I realized that printing*own
resulted in a segfault. Note that several of my classes pass through this bit of code, but never trigger the assertion since*own = 0
. I do not understand what is happening there.What is even more puzzling is that I have other classes, which do exactly the same thing, but with function classes from tbox, and they do not fail. A simple example is the
Freestream(shared_ptr<MshData>, string, shared_ptr<Fct1>)
class, that I instantiate in python asfreestream = Freestream(msh, "group", Fct1C(...))
, whereFct1C
derives fromFct1
.Finally, I should mention that I ran valgrind on both the RelWithDebugInfo (after commenting out the
assert(own)
) and the Release configs, and that it detected no memory leaks.In the end, I have several questions:
- What is happening and why?
- Why do the classes deriving from
flow::F0Ps
cause an issue, while the classes deriving fromtbox::Fct0
do not? - Can we trust valgrind when it says that there is no memory leak (knowing that the comment after the assertion in the SWIG function indicates that there will be a memory leak)?
Additional notes if you got so far without dying:
- This issue https://github.com/swig/swig/issues/731 is still opened in the SWIG repo and is basically the same I run through
- If I change
Assign
to accept the derived classF0PsPhiInf
(instead of the baseF0Ps
), it works fine. I also tried to replaceF0ps
byFct0
, and it runs. - I encountered no issue, nor memory leaks due to the combined use of
shared_ptr
anddirector
, which was my initial fear.
Many thanks in advance.
- mesh size has been slightly adjusted in nonlift test case to ensure proper convergence
- Dirichlet BCs have been kept in bli test case due to lack of robustness with all Neumann BCs, they will have to be removed at some point
- the solver is slightly slower (2%) when only Neumann BCs are used (consistent, because it takes more time to compute and assemble a Neumann BC than a Dirichlet BC)
added cleaning enhancement labels
The build works on Windows with Visual Studio 2017.
Only lift3 fails:
98% tests passed, 1 tests failed out of 63 The following tests FAILED: 22 - flow/tests/lift3.py (Failed) Errors while running CTest
with the following results:
[CTest] iteration count = 3.000000 (expected 3.000000 +/- 1.000000) abs diff = 0.000000e+00 <= 1.000000e+00 [ok] [CTest] CL = 0.132850 (expected 0.135000 +/- 5.0%) rel diff = 1.592774e-02 <= 5.000000e-02 [ok] [CTest] CD = 0.006489 (expected 0.006200 +/- 0.010000) abs diff = 2.886878e-04 <= 1.000000e-02 [ok] [CTest] CS = 0.012392 (expected 0.012000 +/- 1.0%) rel diff = 3.268972e-02 > 1.000000e-02 [wrong!] [CTest] CM = -0.027948 (expected -0.027800 +/- 1.0%) rel diff = 5.319034e-03 <= 1.000000e-02 [ok]