Nicholas Mc Guire <der.herr@...>
On Thu, Nov 22, 2018 at 01:19:44PM +0000, Paul Sherwood wrote:
Hi Clive,

61508 starts out with the top layer, which is actually technology
agnostic - simply put, if we do not understand the system then it
can't be adequately safe - so part 1 does not talk about HW/SW at all
but about context/scope/hazard-analysis/mitigation-allocation...
independent of technological issues. Part 2 then looks at the
technical system (and not just HW) with respect to systematic and
random deviations from specifications as derived by applying part 1,
and part 3 then looks at the specifics of software. So the layering
of 61508 is a very abstract process layering to ensure that the
potential high-level faults - non-understanding expressed in requirements
and design faults - are addressed at all levels, so that we do not kill
that many people with dereferenced NULL pointers - at least not repeatedly,
if the high-level processes work. See e.g. HSE HSG238.
In addition there may be technical "layering" as in layers of protection
and architectural measures - but that's already at the implementation level.
The simplest argument is that the goal of any safety process is that the
safety functional requirements are implemented in the software elements - the
outlined process (route 1_S) is one way seen as suitable to achieve the objectives
of the safety standards (61508 and derivatives - DO 178 is a bit different
because the context of ARP 4754A is well defined - so they can put very specific
needs into DO 178/254, while 61508 is a generic standard and can't do that).
Essentially the goal of achieving the objectives is not dependent on the
process by which the implementation is achieved - verification of the
achievement of the objectives *may* though depend on the means by which
the implementation was achieved (but then again is quite independent of
the question of "intent for use in safety related systems" or not).
DO 178B (respectively DO 248) - but that misses the essential point of

* showing that the assurance data (often process data) on which we
based any such claim is adequate
and this is the thing that is changing, because the two high-level requirements
you give are fully adequate for deterministic and relatively simple
systems (type-A systems in 61508-2 Ed 2 7.4.4.1.2) but not for type-B
systems, because we generally can't demonstrate correctness nor completeness
in any meaningful sense - in other words we increasingly simply do not
know what the "right thing" is, and as soon as non-determinism comes
into play "built the thing right" becomes a probability as well and
needs to be assessed as such (e.g. a pLRU cache replacement in many
current CPUs does not allow one to claim that it is built right other than
probabilistically)... we fell off that horse about 20 years ago but many did not notice ;)
The point is to accept what has been stated many times already: that
safety is not a 100% property anyway - as long as systems were simple
we could entertain the illusion of completeness of testing (an absurd
assertion since the mid 1990s for many systems) and we have not yet
fully developed the necessary understanding and tools to actually
handle complex systems. Also note that this idea of correctness is
bound too strongly to the technical realisation, which puts the focus on
mitigation of faults rather than the elimination of faults at the
requirements and design level - and that is really why we are so lost
with current safety standards when it comes to complex systems, because
we immediately jump to mitigation rather than harvesting the potential
for elimination first - in other words the problem is systems engineering,
not software engineering.
...and whoever had a fault-free initial specification to start
with for her formal specification that then was shown to be implemented?
The idea that "everything in the system matches the requirements" and
"every requirement is built into the system" - kind of the corollary to
your two components above - does not address the key issue in functional
safety, which is that our requirements are wrong because we do not
fully understand the system and its environment (except for the most
trivial systems).
...and the misunderstanding of the system's intent by those writing
the formal specification that then is proven - it is interesting to note
that 61508 Ed 2 (Table A.1) ranks formal requirements specification
lower than semi-formal requirements specification, and in Table C.1 it
is clarified why - reduced understandability!
It is not based on testing - no sane safety standard would suggest
achieving verification by testing alone - it is always analysis and testing,
and if it is reduced to testing only then it will for sure produce a
warm cosy feeling after execution of 100k test-cases... which covered
10E-20% of the system's state-space.
The problem of testing is that in the heads of many we still have the
idea that an aggregation of highly-reliable components forms a highly
reliable system - which is wrong in itself, but becomes a real hazard
as soon as the ability to inspect components is so much easier that we
focus on components, because we can believe that we understand them
in isolation, and then simply drop the main cause, which is interaction
(which is in general not covered by testing - not even integration
testing - maybe to a limited level by field trials).
Does anyone have hard evidence that shows that there is *any*
significant correlation between MISRA C coding rules and bug rates ?
This is one of the cases where we focus on formality because we can,
even though we have little (or no) evidence that these rules or metrics
have any effect (aside from them being used in a way that they
were never intended for anyway).
As a corollary, think about your personal driving experience - how many
situations were you in where you got out by violating a rule?
The assumption that following context-agnostic rules leads to
safety properties of a system is truly absurd.
If you have no specific context, how can you assert more than correctness against
context-free requirements which themselves have no assurance of correctness or
completeness in the context of any specific system? Focussing on what we can,
because we know that we can't handle the level that actually is
relevant, is a form of deliberate ignorance.
Coding standards (and this is the intent of the Linux kernel coding standard)
lead to *readability*, which is maybe the only relevant defence
against correct implementation of the wrong function (or, as you
state above, not "building the right system"). That is the expressed
intent of the Linux kernel coding standard, and readability respectively
understandability of code (and fault behavior) is the key to actually
being able to detect when the correctly implemented code is the wrong
solution for a particular context - the requirements don't do, as they are
an abstraction and as such focus on the intended behavior, not on the
side effects or unintended interactions - thus matching only requirements
of perceived generic elements will necessarily lead to missing the specific
intent for any system in the system's specific corner cases.
No - that's precisely what is only true for very simple components - but it
never holds for complex components, and any OS is a type-B system:
a) the failure mode of at least one constituent component is not well defined; or
b) the behaviour of the element under fault conditions cannot be completely determined; or
c) there is insufficient dependable failure data to support claims for rates of failure for
detected and undetected dangerous failures.
[IEC 61508-2 Ed 2, 7.4.4.1.3]
Pre-certified OSs (or complex libraries) buy you the illusion that you
took care of safety by giving someone else enough money - that's it.
Good to hear that - they should not be - because it depends entirely on the
specific context - the higher the complexity of a system, the more we depend
on looking at the right corners of the system to understand
where they can go wrong - focussing on generic properties (unspecific behaviors
and their correctness asserted against a more or less random model) gives
you very little. The higher the complexity of a system, the more the ability
to analyze the system's specific behavior in the context of env/use-case will determine
the system's safety properties. Even *if* testing could achieve the
initial goal of correctness, the inability to analyze the system would
impair any effort to understand and thus learn from incidents.
Prior usage may well be one building block in a chain of assessment
of a pre-existing element, but I would claim primarily in the sense
of selecting the lowest-risk elements - it will not save you any effort in
assessing the objectives of functional safety - but careful selection
based on prior usage will increase the likelihood that the assessment
will actually conclude positively. To the specifics of 61508 Ed 2
route 3_S (assessment of non-compliant development): the relevance of
a large user base is also the ability to actually harvest process-level
data that can allow assessing the effectiveness of different measures.
E.g. it is trivial to state that a pre-existing element was reviewed, but
without any data on findings, people's competency, level of deviations later
found during operation, etc. we can not actually use "review was done"
as an argument in the assessment of a non-compliant development - and in
this sense the user base is, as you say, "evidence of something"; the trick
is to find sound procedures for extracting the relevant information in that
something so as to be able to make a statement on the process that created
the element. So as soon as you shift the focus from the implementation
details to the process that created these implementation details,
the user base becomes the key "data set" that allows one to actually build
an argument - at least that is the assumption behind the SIL2LinuxMP project.
With an important change - the use of pre-existing elements always
implies that you are building functionality into the system that
does NOT match your needs exactly, and the mitigation again only
lies in the ability to analyze the system to the point where the system can
either be adjusted to the specifics of the element (by updating requirements
and design) or the discrepancy handled at runtime (e.g. wrappers) -
in a complex system it is highly unlikely that the requirements anyone puts
on a complex element like an OS are in exact alignment with any particular
system - not even POSIX 1003.13 PSE 51 matches any real system 100%.
That's the prime fallacy I see in the whole pre-existing SW discussion:
the focus on functionality - the argument for using the common
setup is that the process initially was generating this common
setup, and the measures and techniques to achieve the specified
behavior were IN CONTEXT of the common use-case, no matter if explicitly
stated or implied - diverting from the common use-case potentially
invalidates the results of these measures and techniques. So the
requirement, to be allowed to draw on any process-level claims of the
pre-existing element, is to operate it in as close a context to the
original intent as possible - using common configurations is one
part of this.