path: root/tests/fsm
Commit message | Author | Age | Files | Lines
* fsm: refuse state chg and events after term | Neels Hofmeyr | 2019-10-29 | 1 | -1996/+292

  Refuse state changes and event dispatch for FSM instances that are already terminating.

  It is assumed that refusing state changes and events after FSM termination is seen as the sane expected behavior, hence this change in behavior is merged without being configurable.

  There is no fallout in current Osmocom code trees. fsm_dealloc_test needs a changed expected output, since it is explicitly creating complex FSM structures that terminate. Currently no other C test in Osmocom code needs adjusting.

  Rationale:

  Where multiple FSM instances are collaborating (like in osmo-bsc or osmo-msc), a terminating FSM instance often causes events to be dispatched back to itself, or causes state changes in FSM instances that are already terminating. That is hard to avoid, since each FSM instance could be a cause of failure, and wants to notify all the others of that, which in turn often choose to terminate.

  Another use case: any function that dispatches events or state changes to more than one FSM instance must be sure that after the first event dispatch, the second FSM instance is in fact still allocated. Furthermore, if the second FSM instance *has* terminated from the first dispatch, this often means that no more actions should be taken. That could be done by an explicit check for fsm->proc.terminating, but a more general solution is to do this check internally in fsm.c.

  In practice, I need this to avoid a crash in libosmo-mgcp-client, when an on_success() event dispatch causes the MGCP endpoint FSM to deallocate. The earlier dealloc-in-main-loop patch fixed part of it, but not all.

  Change-Id: Ia81a0892f710db86bd977462730b69f0dcc78f8c
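The refusal described in this commit can be pictured with a toy model. The sketch below is not libosmocore code: `struct toy_fsm_inst`, `toy_term()`, `toy_dispatch()` and `toy_state_chg()` are hypothetical names, and returning `-EINVAL` on refusal is an assumption made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an FSM instance; NOT the libosmocore struct. */
struct toy_fsm_inst {
	int state;
	bool terminating;   /* set once termination has begun */
	int events_handled; /* counts events that were actually acted on */
};

/* Begin termination; from here on the instance refuses all input. */
static void toy_term(struct toy_fsm_inst *fi)
{
	assert(fi != NULL);
	fi->terminating = true;
}

/* Dispatch an event; refused (assumed -EINVAL) once terminating. */
static int toy_dispatch(struct toy_fsm_inst *fi, int event)
{
	(void)event; /* the toy model does not distinguish events */
	if (fi->terminating)
		return -EINVAL;
	fi->events_handled++;
	return 0;
}

/* Change state; refused (assumed -EINVAL) once terminating. */
static int toy_state_chg(struct toy_fsm_inst *fi, int new_state)
{
	if (fi->terminating)
		return -EINVAL;
	fi->state = new_state;
	return 0;
}
```

With the check living inside the dispatch and state-change functions, a caller that notifies several instances in a row does not need its own "is it terminating?" test before each call.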
* add osmo_fsm_set_dealloc_ctx(), to help with use-after-free | Neels Hofmeyr | 2019-10-29 | 2 | -17/+3460

  This is a simpler and more general solution to the problem so far solved by osmo_fsm_term_safely(true). This extends use-after-free fixes to arbitrary functions, not only FSM instances during termination.

  The aim is to defer talloc_free() until back in the main loop.

  Rationale: I discovered an osmo-msc use-after-free crash from an invalid message, caused by this pattern:

      void event_action()
      {
              osmo_fsm_inst_dispatch(foo, FOO_EVENT, NULL);
              osmo_fsm_inst_dispatch(bar, BAR_EVENT, NULL);
      }

  Usually, FOO_EVENT takes successful action, and afterwards we also notify bar. However, in this particular case, FOO_EVENT caused failure, and the immediate error handling directly terminated and deallocated bar. In such a case, dispatching BAR_EVENT causes a use-after-free; this constituted a DoS vector just from sending messages that cause *any* failure during the first event dispatch.

  Instead, when this is enabled, we do not deallocate 'bar' until event_action() has returned back to the main loop.

  Test: duplicate fsm_dealloc_test.c using this, and print the number of items deallocated in each test loop, to ensure the feature works. We also verify that the deallocation safety works simply by fsm_dealloc_test.c not crashing.

  We should probably follow up by refusing event dispatch and state transitions for FSM instances that are terminating or already terminated: see I0adc13a1a998e953b6c850efa2761350dd07e03a.

  Change-Id: Ief4dba9ea587c9b4aea69993e965fbb20fb80e78
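The idea of deferring deallocation until control is back in the main loop can be sketched without talloc. The following is a minimal illustration using plain `malloc()`/`free()`; `toy_defer_free()` and `toy_flush_deferred()` are hypothetical helpers and do not reflect how osmo_fsm_set_dealloc_ctx() is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_DEFERRED 16

/* Pointers queued for release once control is back in the main loop. */
static void *toy_deferred[TOY_MAX_DEFERRED];
static size_t toy_deferred_count;

/* Queue ptr instead of freeing it now.
 * Returns 0 when queued, -1 if the queue is full (caller keeps ownership). */
static int toy_defer_free(void *ptr)
{
	assert(ptr != NULL);
	if (toy_deferred_count >= TOY_MAX_DEFERRED)
		return -1;
	toy_deferred[toy_deferred_count++] = ptr;
	return 0;
}

/* Called from the main loop: release everything queued so far.
 * Returns the number of items released. */
static size_t toy_flush_deferred(void)
{
	size_t n = toy_deferred_count;
	for (size_t i = 0; i < n; i++)
		free(toy_deferred[i]);
	toy_deferred_count = 0;
	return n;
}
```

In this model an error handler would call `toy_defer_free(bar)` where it used to free `bar` directly, so a later access to `bar` within the same call chain still touches valid memory; the main loop calls `toy_flush_deferred()` once per iteration.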
* fsm_dealloc_test: no need for ST_DESTROYING | Neels Hofmeyr | 2019-04-11 | 2 | -1799/+1520

  A separate ST_DESTROYING state originally helped with certain deallocation scenarios. But now that fsm.c avoids re-entering osmo_fsm_inst_term() twice and gracefully handles FSM instance deallocations for termination cascades, it is actually just as safe without a separate ST_DESTROYING state. ST_DESTROYING was used to flag deallocation and prevent entering osmo_fsm_inst_term() twice, which works only in a very limited range of scenarios.

  Remove ST_DESTROYING from fsm_dealloc_test.c to show that all tested scenarios still clean up gracefully.

  Change-Id: I05354e6cad9b82ba474fa50ffd41d481b3c697b4
* fsm: support graceful osmo_fsm_inst_term() cascades | Neels Hofmeyr | 2019-04-11 | 2 | -225/+3283

  Add global flag osmo_fsm_term_safely() -- if set to true, enable the following behavior:

  Detect osmo_fsm_inst_term() occurring within osmo_fsm_inst_term():
  - collect deallocations until the outermost osmo_fsm_inst_term() is done.
  - call osmo_fsm_inst_free() *after* dispatching the parent event.

  If a struct osmo_fsm_inst enters osmo_fsm_inst_term() while another is already within osmo_fsm_inst_term(), do not directly deallocate it, but talloc-reparent it to a separate talloc context, to be deallocated with the outermost FSM inst.

  The effect is that all osmo_fsm_inst freed within an osmo_fsm_inst_term() cascade will stay allocated until all osmo_fsm_inst_term() are complete and all of them will be deallocated at the same time.

  Mark the deferred deallocation state as __thread in an attempt to make cascaded deallocation handling threadsafe. Keep the enable/disable flag separate, so that it is global and not per-thread.

  The feature is showcased by fsm_dealloc_test.c: with this feature, all of those wild deallocation scenarios succeed.

  Make fsm_dealloc_test a normal regression test in testsuite.at.

  Rationale:

  It is difficult to gracefully handle deallocations of groups of FSM instances that reference each other. As soon as one child dispatching a cleanup event causes its parent to deallocate before fsm.c was ready for it, deallocation will hit a use-after-free. Before this patch, by using parent_term events and distinct "terminating" FSM states, parent/child FSMs can be taught to wait for all children to deallocate before deallocating the parent. But as soon as a non-child / non-parent FSM instance is involved, or actually any other cleanup() action that triggers parent FSMs or parent talloc contexts to become unused, it is near impossible to think of all possible deallocation events ricocheting, and to avoid running into freeing FSM instances that were still in the middle of osmo_fsm_inst_term(), or FSM instances entering osmo_fsm_inst_term() more than once.

  This patch makes deallocation of "all possible" setups of complex cross referencing FSM instances easy to handle correctly, without running into use-after-free or double free situations, and, notably, without changing calling code.

  Change-Id: I8eda67540a1cd444491beb7856b9fcd0a3143b18
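The "collect until the outermost term call is done" behavior can be illustrated with a nesting counter. This is a rough sketch under stated assumptions, not the fsm.c implementation: `struct toy_inst`, `toy_inst_term()` and the fixed-size pending list are hypothetical, and a `freed` flag stands in for real deallocation (the commit instead reparents instances to a separate talloc context).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_INST 8

struct toy_inst {
	struct toy_inst *peer; /* instance terminated along with this one, or NULL */
	bool terminating;      /* guards against entering termination twice */
	bool freed;            /* stands in for actual deallocation */
};

static int toy_term_depth;                        /* current nesting of term calls */
static struct toy_inst *toy_pending[TOY_MAX_INST]; /* collected, not yet released */
static int toy_pending_count;

static void toy_inst_term(struct toy_inst *inst)
{
	assert(inst != NULL);
	if (inst->terminating)
		return; /* already on its way out; breaks reference cycles */
	inst->terminating = true;

	toy_term_depth++;
	if (inst->peer)
		toy_inst_term(inst->peer); /* termination cascades to the peer */
	/* Collect instead of releasing right away (toy limit: TOY_MAX_INST). */
	if (toy_pending_count < TOY_MAX_INST)
		toy_pending[toy_pending_count++] = inst;
	toy_term_depth--;

	/* Only the outermost call releases everything that was collected. */
	if (toy_term_depth == 0) {
		for (int i = 0; i < toy_pending_count; i++)
			toy_pending[i]->freed = true;
		toy_pending_count = 0;
	}
}
```

Two instances that name each other as `peer` terminate cleanly here: the inner call returns early for the instance that is already terminating, and both are released together when the outer call unwinds.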
* fsm: add flag to ensure osmo_fsm_inst_term() happens only once | Neels Hofmeyr | 2019-04-11 | 1 | -38/+340

  To prevent re-entering osmo_fsm_inst_term() twice for the same osmo_fsm_inst, add flag osmo_fsm_inst.proc.terminating. osmo_fsm_inst_term() sets this to true, or exits if it already is true.

  Update fsm_dealloc_test.err for illustration. It is not relevant for unit testing yet, just showing the difference.

  Change-Id: I0c02d76a86f90c49e0eae2f85db64704c96a7674
* add fsm_dealloc_test.c | Neels Hofmeyr | 2019-04-11 | 2 | -0/+598

  Despite efforts to properly handle "GONE" events and entering a ST_DESTROYING only once, so far this test runs straight into a heap use-after-free. With current fsm.c, it is hard to resolve the situation with the objects named "other" also causing deallocations besides the FSM instance parent/child relations.

  For illustration, add an "expected" test output file fsm_dealloc_test.err; making this pass will follow in a subsequent patch.

  Change-Id: If801907c541bca9f524c9e5fd22ac280ca16979a
* log: fsm: allow logging the timeout on state change | Neels Hofmeyr | 2019-02-26 | 2 | -10/+13

  Add a flag that adds timeout info to osmo_fsm_inst state change logging. To not affect unit testing, make this an opt-in feature that is disabled by default -- mostly because osmo_fsm_inst_state_chg_keep_timer() will produce non-deterministic logging depending on timing (logs remaining time).

  Unit tests that don't verify log output and those that use fake time may also enable this feature. Do so in fsm_test.c.

  The idea is that in due course we will add osmo_fsm_log_timeouts(true) calls to all of our production applications' main() initialization.

  Change-Id: I089b81021a1a4ada1205261470da032b82d57872
* osmo_fsm_inst_state_chg(): set T also for zero timeout | Neels Hofmeyr | 2019-01-29 | 3 | -0/+57

  Before this patch, if timeout_secs == 0 was passed to osmo_fsm_inst_state_chg(), the previous T value remained set in the osmo_fsm_inst->T.

  For example:

      osmo_fsm_inst_state_chg(fi, ST_X, 23, 42);
      // timer == 23 seconds; fi->T == 42
      osmo_fsm_inst_state_chg(fi, ST_Y, 0, 0);
      // no timer; fi->T == 42!

  Instead, always set to the T value passed to osmo_fsm_inst_state_chg().

  Adjust osmo_fsm_inst_state_chg() API doc; need to rephrase to accurately describe the otherwise unchanged behaviour independently from T.

  Verify in fsm_test.c.

  Rationale: it is confusing to have a T number remaining from some past state, especially since the user explicitly passed a T number to osmo_fsm_inst_state_chg(). (Usually we are passing timeout_secs=0, T=0).

  I first thought this behavior was introduced with osmo_fsm_inst_state_chg_keep_timer(), but in fact osmo_fsm_inst_state_chg() behaved this way from the start.

  This shows up in the C test for the upcoming tdef API, where the test result printout was showing some past T value sticking around after FSM state transitions. After this patch, there will be no such confusion.

  Change-Id: I65c7c262674a1bc5f37faeca6aa0320ab0174f3c
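The behavior after this patch can be modeled in a few lines. The sketch below uses a hypothetical `struct toy_timed_inst` and `toy_timed_state_chg()`, not the real osmo_fsm_inst API; it only shows that T is stored unconditionally while a timer is armed only for a nonzero timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the timer-related fields of an FSM instance. */
struct toy_timed_inst {
	int state;
	int T;                      /* timer number of the current state */
	bool timer_active;          /* whether a timeout is armed */
	unsigned long timeout_secs; /* armed timeout, 0 if none */
};

/* State change that always stores the caller's T, even when
 * timeout_secs == 0 and therefore no timer is started. */
static void toy_timed_state_chg(struct toy_timed_inst *fi, int new_state,
				unsigned long timeout_secs, int T)
{
	assert(fi != NULL);
	fi->state = new_state;
	fi->T = T; /* before the patch, this was skipped for a zero timeout */
	fi->timeout_secs = timeout_secs;
	fi->timer_active = timeout_secs > 0;
}
```

Replaying the two calls from the commit message against this model leaves `T == 0` after the second call instead of the stale 42.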
* add osmo_fsm_inst_state_chg_keep_timer() | Neels Hofmeyr | 2018-05-31 | 2 | -1/+105

  Change-Id: I3c0e53b846b2208bd201ace99777f2286ea39ae8
* add osmo_fsm_inst_update_id_f() | Neels Hofmeyr | 2018-04-09 | 2 | -0/+51

  In the osmo-msc, I would like to set the subscr conn FSM identifier by a string format, to include the type of Complete Layer 3 that is taking place. I could each time talloc a string and free it again. This API is more convenient.

  From osmo_fsm_inst_update_id(), call osmo_fsm_inst_update_id_f() with "%s" (or pass NULL).

  Put the name updating into separate static update_name() function to clarify.

  Adjust the error message for erratic ID: don't say "allocate", it might be from an update. Adjust test expectation.

  Change-Id: I76743a7642f2449fd33350691ac8ebbf4400371d
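The relationship between a plain setter and its printf-style variant, as described above, might look roughly like this. The sketch uses a fixed-size buffer and hypothetical names (`struct toy_named`, `toy_update_id_f()`, `toy_update_id()`); the real functions operate on a talloc'd struct osmo_fsm_inst and also validate the identifier, which is omitted here.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Toy object carrying an identifier in a fixed-size buffer. */
struct toy_named {
	char id[64];
};

/* Set the id from a printf-style format; a NULL format clears the id.
 * Returns 0 on success, -1 on a format error or if the result was truncated. */
static int toy_update_id_f(struct toy_named *obj, const char *fmt, ...)
{
	va_list ap;
	int rc;

	assert(obj != NULL);
	if (!fmt) {
		memset(obj->id, 0, sizeof(obj->id));
		return 0;
	}
	va_start(ap, fmt);
	rc = vsnprintf(obj->id, sizeof(obj->id), fmt, ap);
	va_end(ap);
	if (rc < 0 || (size_t)rc >= sizeof(obj->id))
		return -1;
	return 0;
}

/* Plain-string variant: forwards through "%s", or passes NULL to clear. */
static int toy_update_id(struct toy_named *obj, const char *id)
{
	return id ? toy_update_id_f(obj, "%s", id) : toy_update_id_f(obj, NULL);
}
```

Forwarding through `"%s"` rather than passing the caller's string as the format keeps any `%` characters in an id from being interpreted as conversions.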
* cosmetic: osmo_fsm_inst_update_id(): don't log "allocate" | Neels Hofmeyr | 2018-04-09 | 1 | -2/+2

  On erratic id in osmo_fsm_inst_update_id(), don't say "Attempting to allocate FSM instance".

  Escape the invalid id using osmo_quote_str().

  Change-Id: I770fc460de21faa42b403f694e853e8da01c4bef
* fsm: id: properly set name in case of NULL id | Neels Hofmeyr | 2018-04-09 | 2 | -14/+1

  Since alloc relies on osmo_fsm_inst_update_id() to set the name, never skip that. In osmo_fsm_inst_alloc(), we allow passing a NULL id, and in osmo_fsm_inst_update_id(), we set the name without id if id is NULL.

  Change-Id: I6d6b09a811b82770818f19b189a57d9fc4a8133b
* fsm_test: more thoroughly test FSM inst ids and names | Neels Hofmeyr | 2018-04-09 | 2 | -7/+138

  Place id and name testing in its separate section, test_id_api().

  Add a test that actually allocates an FSM instance with a NULL id, which is allowed, but uncovers a bug of an unset FSM instance name. osmo_fsm_inst_name() falls back to the fsm struct's name on NULL, but osmo_fsm_inst_find_by_name() fails to match if the instance's name is NULL (and until recently even crashed). Show this in fsm_test.c with loud comments.

  Add test to clear the id by passing NULL.

  Add test for setting an empty id.

  Add test for setting an invalid identifier (osmo_identifier_valid() == false).

  Change-Id: I646ed918576ce196c395dc5f42a1507c52ace2c5
* fsm_test: terminate the main loop instead of exit on timeout | Neels Hofmeyr | 2018-04-09 | 2 | -2/+7

  In fsm_test.c, we have FSM instance cleanup after the select main loop, but we exit(0) in the timer cb; hence the final code is never called. Rather clean up the instance and hence also test that, by using a global flag to exit the main loop upon timeout. Adjust expected stderr output.

  BTW, in a subsequent commit, I want to move the fsm instance id testing to below the main loop, to more clearly group the tested bits.

  Change-Id: Ia47811ffcc1bd68d2630c86be7ab98fc1f338773
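The pattern of leaving the main loop through a flag, so that the cleanup code after the loop still runs, can be sketched as below. All names are hypothetical and the loop counts ticks instead of calling select(); it is only meant to show why setting a flag in the timer callback differs from calling exit(0) there.

```c
#include <assert.h>
#include <stdbool.h>

static volatile bool toy_main_loop_exit; /* set by the timer callback */
static bool toy_cleanup_ran;             /* records that post-loop code ran */

/* Timer callback: request loop termination instead of calling exit(0). */
static void toy_timer_cb(void)
{
	toy_main_loop_exit = true;
}

/* Run the toy main loop; the "timer" fires after the given number of ticks.
 * Returns the number of iterations executed. */
static int toy_run(int ticks_until_timeout)
{
	int ticks = 0;

	assert(ticks_until_timeout > 0);
	while (!toy_main_loop_exit) {
		ticks++;
		if (ticks >= ticks_until_timeout)
			toy_timer_cb();
	}
	/* Reached only because the callback did not exit the process. */
	toy_cleanup_ran = true;
	return ticks;
}
```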
* fsm: Update the name as well if the id is updated and accept NULL | Daniel Willmann | 2018-03-19 | 1 | -0/+4

  If the name stays the same the log messages will still log with the old id. Since we can now change the id we need to update the name as well. NULL as id was allowed before so we should allow that as well.

  Change-Id: I6b01eb10b8a05fee3e4a5cdefdcf3ce9f79545b4
* print BIG FAT ERROR message if osmo_fsm lacks event names | Stefan Sperling | 2018-02-26 | 2 | -4/+11

  Event names are displayed in VTY commands so all FSMs should have them. Print an error message if an FSM is registered without event names.

  We could also return an error code; however, at present no caller checks the return value of osmo_fsm_register() so this would be pointless.

  Add event names to the test FSM and update expected output accordingly.

  Change-Id: I08b100d62b5c50bf025ef87d31ea39072539cf37
  Related: OS#3008
* fsm_test.c: fix unreachable check | Vadim Yanitskiy | 2017-05-15 | 1 | -1/+3

  Change-Id: Ic3d5da00f7ece6dbcd4c999187a5748c9331e60f
* control_if: Add control interface commands for FSMs | Harald Welte | 2017-04-27 | 2 | -11/+45

  This allows programmatic access to introspection of FSM instances, which is quite handy from e.g. external test cases: Send a message to the code, then use the CTRL interface to check if that message has triggered the right kind of state transition.

  Change-Id: I0f80340ee9c61c88962fdd6764a6098a844d0d1e
* osmo_fsm: Lookup functions to find FSM Instance by name or ID | Harald Welte | 2017-04-16 | 2 | -9/+15

  Introduce two lookup helper functions to resolve a fsm_instance based on the FSM and name or ID. Also, add related test cases.

  Change-Id: I707f3ed2795c28a924e64adc612d378c21baa815
* fsm_test.c: fix compiler warning: timer cb return type | Neels Hofmeyr | 2016-12-24 | 1 | -1/+1

  Change-Id: Ifd7e85cd69b5e7e473000abc1ef7a56748aafc0e
* fsm: add LOGPFSML to pass explicit logging level | Neels Hofmeyr | 2016-12-14 | 1 | -1/+1

  Provide one central LOGPFSML to print FSM information, take the FSM logging subsystem from the FSM instance but use an explicitly provided log level instead of the FSM's default level.

  Use to replace some, essentially, duplications of the LOGPFSM macro.

  In effect, the fsm_test's expected error changes, since the previous code dup for logging events used round braces to indicate the fi's state, while the central macro uses curly braces.

  Change-Id: If295fdabb3f31a0fd9490d1e0df57794c75ae547
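The layering of a default-level logging macro on top of one that takes an explicit level can be sketched as follows. These are hypothetical `TOY_LOG`/`TOY_LOGL` macros that write into a buffer; they are not the LOGPFSM/LOGPFSML definitions from libosmocore, and the `name{state}: ` prefix is assumed only from the commit's remark about curly braces. The `##__VA_ARGS__` form is a GNU extension supported by GCC and Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy instance: just what the log prefix needs. */
struct toy_log_inst {
	const char *name;
	const char *state;
	int default_level; /* level used when the caller does not pass one */
};

static char toy_log_buf[128]; /* last formatted message */
static int toy_log_level;     /* level of the last message */

/* Central macro: the caller chooses the level explicitly. */
#define TOY_LOGL(fi, level, fmt, ...) do { \
		assert((fi) != NULL); \
		toy_log_level = (level); \
		snprintf(toy_log_buf, sizeof(toy_log_buf), "%s{%s}: " fmt, \
			 (fi)->name, (fi)->state, ##__VA_ARGS__); \
	} while (0)

/* Convenience macro: same output, using the instance's default level. */
#define TOY_LOG(fi, fmt, ...) \
	TOY_LOGL(fi, (fi)->default_level, fmt, ##__VA_ARGS__)
```

Because the convenience macro only forwards to the central one, the prefix format is defined in one place, which is the kind of deduplication the commit describes.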
* Add logging and testing for FSM deallocation | Max | 2016-11-08 | 1 | -4/+5

  osmo_fsm_inst_alloc() logs allocation but osmo_fsm_inst_free() is silent. Fix this by adding a log message for deallocation to make FSM lifecycle tracking easier. Also make sure it's covered by the test suite.

  Change-Id: I7e5b55a1fff8e36cf61c7fb61d3e79c1f00e29d2
* Add osmo_fsm_unregister() to header | Max | 2016-11-02 | 1 | -0/+1

  Previously the function was defined but not exposed, so there was a way to register an FSM but no way to unregister it.

  Change-Id: I2e749d896009784b77d6d5952fcc38e1c131db2b
* Add Finite State Machine abstraction code | Harald Welte | 2016-06-16 | 3 | -0/+166

  This code is supposed to formalize some of the state machine handling in Osmocom code.

  Change-Id: I0b0965a912598c1f6b84042a99fea9d522642466
  Reviewed-on: https://gerrit.osmocom.org/163
  Tested-by: Jenkins Builder
  Reviewed-by: Harald Welte <laforge@gnumonks.org>