---
abstract: "Developers of fault-tolerant distributed systems need to guarantee that fault tolerance mechanisms they build are in themselves reliable. Otherwise, these mechanisms might in the end negatively affect overall system dependability, thus defeating the purpose of introducing fault tolerance into the system. To achieve the desired levels of reliability, mechanisms for detecting and handling errors should be developed rigorously or formally. We present an approach to modeling and verifying fault-tolerant distributed systems that use exception handling as the main fault tolerance mechanism. In the proposed approach, a formal model is employed to specify the structure of a system in terms of cooperating participants that handle exceptions in a coordinated manner, and coordinated atomic actions serve as representatives of mechanisms for exception handling in concurrent systems. We validate the approach through two case studies: (i) a system responsible for managing a production cell, and (ii) a medical control system. In both systems, the proposed approach has helped us to uncover design faults in the form of implicit assumptions and omissions in the original specifications."
creators_id:
  - ~
  - alexander.romanovsky@ncl.ac.uk
  - ~
creators_name:
  - family: Castor Filho
    given: Fernando
  - family: Romanovsky
    given: Alexander
  - family: Rubira
    given: Cecilia
date: 2009
date_type: published
datestamp: 2009-03-25 17:08:38
dir: disk0/00/00/00/88
eprint_status: archive
eprintid: 88
fileinfo: /style/images/fileicons/application_pdf.png;/88/1/fernando%2Djss%2D2009.pdf
full_text_status: public
ispublished: pub
item_issues_count: 0
lastmod: 2010-04-19 15:05:53
metadata_visibility: show
pagerange: 874-890
publication: The Journal of Systems and Software
refereed: TRUE
rev_number: 17
status_changed: 2009-03-25 17:08:38
subjects:
  - deploy_method_resil
  - deploy_method_proof
title: Improving reliability of cooperative concurrent systems with exception flow analysis
type: article
userid: 7
volume: 82