Preprint
Review

This version is not peer-reviewed.

Class 2 Dangerous Goods Off-Gassing as an Early Warning Signal for Lithium-Ion Battery Thermal Runaway in Aircraft Cargo Holds: A Critical Review and Decision Framework for Gas-Phase Sensor Selection

Submitted: 01 June 2026

Posted: 02 June 2026


Abstract
Air transport of lithium-ion batteries has grown faster than the fire-protection systems designed to contain them. The United States Federal Aviation Administration verified a record 89 battery thermal events aboard commercial aircraft in 2024, a sixteen percent rise on the previous year, and an independent airline reporting programme recorded a forty percent increase in cargo-side incidents between 2021 and 2025. When a cell fails, it can vent for seconds to several minutes before producing the visible smoke that current photoelectric detectors are built to catch. That vent gas is not a vague hazard but a chemically defined mixture of hydrogen, carbon monoxide, carbon dioxide and light hydrocarbons, almost all of which falls under Class 2 of the United Nations dangerous goods scheme. This review treats that correspondence as a deliberate design starting point, positioning the Class 2 taxonomy as a sensor architecture input rather than a filing category. The experimental literature on vent gas composition is synthesised and read against the operational record of aviation incidents. An Analytic Hierarchy Process and TOPSIS decision model is then constructed to rank five candidate sensor families, namely electrochemical (EC), non-dispersive infrared (NDIR), tunable diode laser absorption spectroscopy (TDLAS), metal-oxide semiconductor (MOX) and photoionisation (PID), against seven criteria covering detection limit, response time, selectivity, flight-envelope tolerance, certification maturity, power draw and cost. The electrochemical sensor ranked first (TOPSIS closeness coefficient C* = 0.741), followed by non-dispersive infrared (C* = 0.635) and tunable diode laser spectroscopy (C* = 0.586). Robustness checks confirmed that the top-three order holds under all twenty-percent single-weight perturbations and that the electrochemical sensor retains first place across three policy scenarios, although the order behind it shifts with the priority structure.
Because no single technology covers the full Class 2 envelope, a combined architecture is recommended: an electrochemical hydrogen channel, a non-dispersive infrared channel for the carbon oxides, and a metal-oxide array for hydrocarbon classification and redundancy. This combination aligns with the chemistry-specific findings of an independent principal component analysis of 247 reported failure cases. The review closes with concrete regulatory proposals for the ICAO Technical Instructions, the IATA Dangerous Goods Regulations and the EASA certification basis.

1. Introduction

The number of lithium-ion cells moving through the air freight and passenger systems each year now runs into the billions. They power phones, laptops, medical devices, e-cigarettes and the power banks that travellers carry without a second thought. The property that makes them useful, a high energy density packed into a small mass, is the same property that makes a failed cell dangerous. A single 18650 cell at full charge can release tens of kilojoules during a runaway event, together with several litres of flammable and toxic gas [1,2]. Inside an aircraft cargo hold, the outcome of such an event depends far less on the cell itself than on how quickly the failure is detected.
The incident count has risen steadily. The Federal Aviation Administration verified 89 lithium battery incidents involving smoke, fire or extreme heat aboard commercial aircraft in 2024. This was a sixteen percent increase over 2023 and the highest annual count the agency has recorded [3,4]. Seventy-seven of those events happened on passenger aircraft and twelve on cargo aircraft. By the end of June 2025 the agency had already logged 38 more, so the elevated rate has carried into the following year [5]. A separate dataset assembled by Underwriters Laboratories Standards and Engagement, drawing on voluntary reports from 37 airlines, found a forty percent rise in cargo aviation thermal events across the 2021 to 2025 window, with 65 cargo incidents in total [6,7]. Power banks and spare batteries make up a large share of the device-level breakdown.
Two accidents remain central to the regulatory record. United Parcel Service Flight 6, a Boeing 747 freighter carrying roughly 81,000 lithium-ion cells, was lost near Dubai in September 2010 after an in-flight cargo fire [8]. Asiana Airlines Flight 991, also a 747 freighter, went down off Jeju in July 2011 with dangerous goods including lithium-ion batteries aboard [9]. In January 2025, a power bank in an overhead bin ignited on a parked Air Busan aircraft. No one was killed, but the airframe was destroyed. The event was spotted by passengers and crew, not by any installed detector [6]. The regulatory response to these failures has been steady tightening of the ICAO Technical Instructions and the IATA Dangerous Goods Regulations, including the ban on lithium-ion cells as cargo on passenger aircraft and a thirty percent state-of-charge cap for cargo shipments [10,11]. Neither measure addresses the detection problem inside the hold.
Cargo compartment fire detection still rests almost entirely on photoelectric smoke sensors, which respond to scattered light from solid combustion particles. The trouble is that a lithium-ion cell in the early stage of failure does not produce much in the way of solid smoke. It produces gas. Work by the FAA Fire Safety Branch and by several university groups has shown that a failing cell vents hydrogen, carbon monoxide, methane, ethane, ethylene, propylene and traces of hydrogen fluoride for a period that can run from seconds to several minutes before the smoke threshold is reached [12,13]. A detector tuned to the gas rather than to the smoke could therefore activate well before the suppression system reaches its effective limit.
One connection has received little attention in the literature: the vent gas inventory maps almost cleanly onto Class 2 of the United Nations Recommendations on the Transport of Dangerous Goods. Hydrogen and the light hydrocarbons are Class 2.1 flammable gases. Carbon dioxide is a Class 2.2 non-flammable, non-toxic gas. Carbon monoxide is a Class 2.3 toxic gas [14]. The same scheme that a dangerous goods officer uses to classify a shipment also describes, almost item for item, what that shipment will emit if it fails. The regulatory language already in use across the freight system can therefore serve as the organising principle for a detection architecture, something neither the battery safety literature nor the aerospace fire-protection literature has previously made explicit.
Recent literature has developed on two fronts. Bugryniec and colleagues published a thorough meta-analysis of vent gas composition in 2024, pulling the experimental data together on a per-watt-hour basis and settling several questions about how chemistry and state of charge shape the gas mixture [15]. On the sensing side, reviews by Han and colleagues, by Teng and Lv, and by Zhang and colleagues have surveyed the resistive, electrochemical and optical technologies suited to early warning [16,17,18], while Wang and colleagues used a principal component analysis of 247 reported failures to derive chemistry-specific sensor advice [19]. No published work applies these consolidated findings specifically to the aircraft cargo hold, where the constraints are chemical, regulatory and economic in equal measure, and where any proposed device must satisfy the certification basis of CS-25 and 14 CFR Part 25 before it can fly.
This review addresses that gap. Three questions guide the analysis. What is the Class 2 vent gas profile that a cargo-hold sensor must detect, and how does it shift across the device categories that dominate the incident record? What does the operational record show about the interval between first venting and smoke-alarm activation? And given the constraints of the flight envelope, the certification regime and the cost structure of commercial aviation, which sensor technologies—and which combinations—offer the most defensible early warning design?
The contribution is specific in three respects. The Class 2 taxonomy is used explicitly as a design driver for sensor selection—a step neither the off-gas reviews nor the fire-protection literature has previously taken. A multi-criteria decision protocol is built and documented in full, so the weighting and ranking can be audited and updated. And a set of concrete regulatory openings is identified where gas-phase detection could enter the ICAO, IATA and EASA frameworks without disturbing the established classification system.
The paper is organised as follows. Section 2 establishes the theoretical and technical foundations. Section 3 details the literature assembly and decision model. Section 4 presents the synthesis and sensor ranking. Section 5 discusses regulatory implications, open engineering questions and limitations. Section 6 draws conclusions.

2. Theoretical Framing and Technical Background

2.1. A Safety-Science Lens: Normal Accidents and High-Reliability Organisations

A detection gap measured in minutes may appear modest in isolation; two bodies of safety theory demonstrate why it is not. Perrow's normal accident theory argues that in systems that are both tightly coupled and interactively complex, failures are not aberrations but expected features of the system's design [20]. An aircraft cargo hold is exactly such a system. The cargo is dense and varied, the suppression agent has a fixed and limited capacity, the crew cannot reach the fire in flight, and the time available to divert and land is bounded by fuel and geography. A lithium-ion runaway is a tightly coupled failure: once the first cell vents, propagation to neighbouring cells can outrun the suppression envelope, and there is no slack in the system to absorb the delay.
High-reliability organisation theory supplies the other side. Organisations that operate hazardous technologies with very low failure rates, such as carrier flight decks and air traffic control, share a preoccupation with early detection of weak signals and a refusal to let small anomalies pass unexamined [21]. Applied here, the lesson is direct. The off-gas phase is a weak signal that the current detection architecture is not built to hear. Reading that signal early, and acting on it, is precisely the kind of anticipatory practice that distinguishes a high-reliability system from one that merely waits for the alarm. The technology-organisation-environment framework rounds this out by reminding us that whether a detection technology is actually adopted depends not only on its physics but on the certification regime and the operating context it has to fit [22]. Together these three frameworks shape the structure of what follows: the chemistry defines the signal, the operational record establishes the gap, and the decision model weighs sensor options against constraints that are organisational and regulatory as much as technical.

2.2. How a Lithium-Ion Cell Fails and What It Releases

Thermal runaway is a self-feeding exothermic breakdown that begins when a cell generates heat faster than it can shed it. The accepted picture, built largely on the experimental work of Wang, Feng and their collaborators, runs through four stages [1,23]. First the solid electrolyte interphase decomposes, somewhere between about 80 and 130 degrees Celsius, and the anode begins to react with the electrolyte, giving off hydrogen and light hydrocarbons. Then the separator melts and internal short circuits start. Next the cathode breaks down and releases oxygen, with the amount depending strongly on chemistry: nickel-rich oxides such as NCA and high-nickel NMC give up oxygen more readily than lithium iron phosphate. Finally the vented gases and electrolyte vapour ignite, often outside the cell casing. The first venting—when the safety valve opens and carbonate-rich gas escapes before ignition—can precede full runaway by a useful margin [15,24]. That first venting is the signal worth catching.
The composition of the vent gas has been measured often enough that the 2024 Bugryniec meta-analysis could settle the broad picture [15]. Golubkov and colleagues had earlier established baselines for 18650 cells, reporting hydrogen at thirty to fifty percent of the mixture by mole fraction for high-nickel cathodes, the carbon oxides together at another thirty to forty percent, and the hydrocarbons making up the rest [25,26]. State of charge drives the total volume up in a roughly linear way; FAA Fire Safety Branch tests under a standardised method confirmed that both the gas volume and the combustion energy scale with state of charge, with carbon dioxide, hydrogen and carbon monoxide turning up as the three largest constituents by volume [12,13]. Larsson and colleagues documented hydrogen fluoride at twenty to two hundred milligrams per watt-hour, a figure that matters less for detection than for crew exposure and avionics corrosion afterward [27]. Table 1 sets the inventory against the dangerous goods scheme.
Three observations follow from Table 1. The mixture spans all three Class 2 divisions, so a detector built around one analyte will always have a blind spot. The flammable components have low lower flammability limits (about four percent by volume in air for hydrogen), and in a sealed unit load device with little ventilation the early off-gas can build toward those limits faster than intuition suggests. The toxic species, mainly carbon monoxide and hydrogen fluoride, will not reach immediately dangerous levels in the bulk hold air during a single venting, but they pool in local pockets and pose a real hazard to anyone opening the hold afterward. Chemistry shifts the balance: lithium iron phosphate tends to give more hydrogen and hydrocarbon per unit capacity, while the nickel-rich oxides produce more carbon oxides at higher peak temperature [15,24,28].

2.3. The Cargo Hold and What Currently Watches It

Section 25.857 of 14 CFR Part 25 sorts cargo compartments by how a fire in them is detected, suppressed and accessed [29]. Class C compartments, on most wide-body aircraft, have smoke detection, a built-in suppression system that was historically Halon 1301 and is increasingly a halocarbon replacement, and little or no crew access in flight. Class E compartments, found on dedicated freighters such as the 747F and 777F, rely on detection and on starving the fire of oxygen through controlled depressurisation rather than on a discharged agent. Both classes were certified on the assumption that the fire would come from ordinary cargo burning, not from an electrochemical device that supplies its own oxygen as it fails.
The detector in both cases is the photoelectric smoke unit, set to trigger near a fraction of a percent obscuration per foot, usually with a confirmation rule to keep false alarms down. For ordinary cargo this works. For a lithium-ion failure it lags, because the early phase makes little visible aerosol and the smoke only builds once the gases ignite or the next cell goes. By the time the alarm sounds and the agent is discharged, the runaway may already be beyond what Halon can contain. FAA Technical Center work showed that Halon 1301 at the standard concentration cannot indefinitely hold down a multi-cell runaway, since the cathode keeps supplying oxygen from inside [30], and that vent gas can ignite after the agent has been discharged, with pressure consequences the original certification never contemplated [31].
Three separate lines of evidence point the same way. Sensor laboratories have repeatedly caught hydrogen and carbon monoxide at parts-per-million levels within tens of seconds of a cell venting [16,17,32]. Battery-storage research has shown gas detection beating surface-temperature monitoring on speed [33]. And the United Nations Global Technical Regulation No. 20 on electric vehicle safety now calls for a warning at least five minutes ahead of a hazardous situation in the passenger compartment, which is a notable institutional endorsement of early warning as a principle and one that gas-based detection is a candidate route to meeting [34]. The open problem is not whether gas detection works in a laboratory. It is how to make it work inside a certified cargo hold.

2.4. The Regulatory Frame

Four instruments shape this space. The ICAO Technical Instructions are the binding international rule set [10]. The IATA Dangerous Goods Regulations, now in their 67th edition for 2026, restate ICAO with extra operational detail [11]. The FAA adds Special Federal Aviation Regulations, Advisory Circulars and Safety Alerts for Operators [35]. EASA adds the CS-25 certification specifications and a run of Safety Information Bulletins [36,37]. For lithium-ion cells the core provisions sit in Packing Instructions 965 to 967, which separate cells shipped alone (UN 3480) from those packed with or inside equipment (UN 3481); Packing Instructions 968 to 970 are the lithium-metal counterparts. The passenger-aircraft cargo ban and the thirty percent state-of-charge cap for cargo shipments have been in force since 2016 [10,11]. These measures lower the hazard without removing it, since a cell at thirty percent charge can still propagate and field compliance is only partial [4,7]. The framework is built around prevention by classification and packaging. An in-flight gas-phase detection layer is not required, even though it sits comfortably inside the ICAO Annex 18 safety-management philosophy. Section 5 returns to that opening.

3. Scope and Analytical Approach

This is a critical review rather than a systematic review, and it does not claim the exhaustive, protocol-bound coverage that the latter implies. The aim is to synthesise what is known, read it against the operational record, and turn that synthesis into a usable decision tool. The work proceeds in three connected layers, described below.

3.1. How the Literature Was Assembled

The evidence base was built from three streams. The first is the experimental and review literature on vent gas composition and on gas-phase sensing, gathered from Scopus, Web of Science, IEEE Xplore, ScienceDirect and PubMed, with the 2024 Bugryniec meta-analysis serving as the anchor for composition data and the recent sensing reviews serving as the anchor for detector performance [15,16,17,18,19]. Because that composition meta-analysis already exists and is recent, there was no value in repeating it; this review leans on it and concentrates instead on the aviation-specific reading. The search favoured work published between 2010, the year of the UPS Flight 6 loss, and early 2026, with the bulk of the sensing material concentrated in the last three years.
The second stream is the operational record. Four public sources were read together: the FAA register of lithium battery incidents involving smoke, fire, extreme heat or explosion, which now lists more than 650 verified events since 2006 [3]; the UL Standards and Engagement Thermal Runaway Incident Program, which aggregates voluntary reports from 37 airlines and reaches the ground-handling and ramp events that fall outside mandatory FAA reporting [6,7]; the NASA Aviation Safety Reporting System for crew narratives [38]; and the EASA annual safety reviews together with the ICAO ADREP system for investigated occurrences [37,39]. These were used to characterise how events unfold and, where the record allowed, how detection actually happened.
The third stream is the regulatory and certification corpus: the ICAO Technical Instructions, the IATA Dangerous Goods Regulations, the relevant FAA Advisory Circulars and Technical Center reports, and the EASA CS-25 material [10,11,29,30,35,36]. This stream supplied the constraints that any sensor proposal has to satisfy, and it grounds the policy discussion in Section 5.

3.2. The Decision Model

The sensor selection problem has the classic shape of a multi-criteria decision: several candidate technologies, several criteria that pull in different directions, and no option that wins on everything. The Analytic Hierarchy Process is well suited to setting the relative weight of the criteria through structured pairwise comparison [40,41], and TOPSIS is a transparent way to rank alternatives once the weights are fixed [42]. The pairing is widely used in transport and energy technology assessment [43,44], and the recent explainable MCDA framework for maritime decarbonisation published in this journal is a close methodological cousin [45]. The choice of AHP-TOPSIS over the principal component approach taken by Wang and colleagues [19] is deliberate: PCA is more compact, but AHP keeps every weighting judgement visible and auditable, which matters more when the output is meant to inform a certification conversation.

3.2.1. Candidate Technologies

Five sensor families were carried into the analysis, chosen to span the practical detection principles available at a technology readiness level of six or higher and validated against the Class 2 species [16,17,18,32,33]. Metal-oxide semiconductor sensors read the change in resistance of a heated oxide film as it meets a reducing gas; they are cheap and sensitive to hydrogen, carbon monoxide and hydrocarbons, and recent noble-metal and heterojunction designs have improved their selectivity, though humidity and baseline drift remain weaknesses. Non-dispersive infrared sensors read the absorption of infrared light at characteristic bands; they handle carbon dioxide, carbon monoxide and hydrocarbons well but cannot see hydrogen, which has no infrared signature. Electrochemical sensors generate a current from a gas-selective reaction at an electrode; they are the industry standard for carbon monoxide and hydrogen at low concentrations, with annual drift and low-humidity sensitivity as the main limits. Photoionisation detectors ionise volatile organics with ultraviolet light; they are strong on unsaturated hydrocarbons but blind to hydrogen and the small alkanes. Tunable diode laser absorption spectroscopy reads narrow-line laser absorption at a chosen wavelength; it offers the best selectivity and the lowest detection limit of the set, at a cost and packaging burden that currently keeps it out of routine aerospace use.
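The coverage pattern described above can be checked mechanically. The following is a minimal sketch, not part of the decision model itself: the division assignments follow the Introduction, while the species sets are a simplified reading of the prose descriptions (for example, "hydrocarbons" is taken to mean all four listed hydrocarbons). TDLAS is omitted because its coverage depends on the wavelength chosen. The names `CLASS2_DIVISION`, `SENSOR_COVERAGE`, `divisions_covered` and `blind_spots` are illustrative.

```python
# Class 2 division of each vent-gas species, as stated in the Introduction.
CLASS2_DIVISION = {
    "H2": "2.1", "CH4": "2.1", "C2H6": "2.1", "C2H4": "2.1", "C3H6": "2.1",
    "CO2": "2.2",
    "CO": "2.3",
}

# Species each sensor family responds to (simplified assumption drawn from
# the qualitative descriptions in this section, not from datasheets).
SENSOR_COVERAGE = {
    "MOX": {"H2", "CO", "CH4", "C2H6", "C2H4", "C3H6"},
    "NDIR": {"CO2", "CO", "CH4", "C2H6", "C2H4", "C3H6"},  # no H2 signature
    "EC": {"H2", "CO"},
    "PID": {"C2H4", "C3H6"},  # unsaturated hydrocarbons only
}


def species_seen(sensors):
    """Union of species covered by the given sensor families."""
    return set().union(*(SENSOR_COVERAGE[s] for s in sensors))


def divisions_covered(sensors):
    """Class 2 divisions in which at least one species is detectable."""
    return {CLASS2_DIVISION[g] for g in species_seen(sensors)}


def blind_spots(sensors):
    """Vent-gas species that none of the given sensors responds to."""
    return sorted(set(CLASS2_DIVISION) - species_seen(sensors))


if __name__ == "__main__":
    for combo in (["EC"], ["NDIR"], ["PID"], ["EC", "NDIR", "MOX"]):
        print(combo, sorted(divisions_covered(combo)), blind_spots(combo))
```

Under these assumptions no single family reaches all three divisions without a blind spot, which is the qualitative point the ranking later quantifies.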

3.2.2. Criteria

Seven criteria were defined in advance, each meant to be readable from datasheets, peer-reviewed evaluations and certification documents rather than from opinion. Table 2 lists them.

3.2.3. Weighting

The criterion weights in this review were derived by triangulation across the published sensor-evaluation literature [16,17,18,19,32] and the FAA Fire Safety Branch findings on detection priority [12,13,30]. This literature-based derivation keeps every judgement traceable to a source. The pairwise comparison matrix, the resulting eigenvector weights and the consistency check are given in full in Appendix A, so a reader can reconstruct or challenge them. The consistency ratio of the matrix used is about 0.03, comfortably under the 0.10 threshold Saaty set [40]. A planned extension will test these weights against an independent expert panel under institutional ethics approval; that work is separate from the present review and is noted here only for transparency.
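For readers unfamiliar with the mechanics, the weighting step can be sketched as follows. This is an illustrative implementation under stated assumptions, not the Appendix A calculation: the 3 x 3 matrix is hypothetical, the principal eigenvector is approximated by power iteration, and the random index values are Saaty's commonly tabulated ones. The function name `ahp_weights` is illustrative.

```python
# Saaty's random consistency index by matrix order (commonly tabulated values).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix, iterations=100):
    """Return (weights, consistency_ratio) for a reciprocal pairwise matrix.

    Weights approximate the principal eigenvector via power iteration.
    """
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(aw)
        w = [x / total for x in aw]
    # Estimate the principal eigenvalue from A.w = lambda.w
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return w, cr


if __name__ == "__main__":
    # Hypothetical judgements for three criteria only (not Appendix A):
    # detection limit vs response time vs cost.
    pairwise = [
        [1.0, 3.0, 5.0],
        [1 / 3, 1.0, 3.0],
        [1 / 5, 1 / 3, 1.0],
    ]
    weights, cr = ahp_weights(pairwise)
    print([round(x, 3) for x in weights], round(cr, 3))
    print("acceptable" if cr < 0.10 else "revise judgements")
```

A consistency ratio under 0.10 indicates that the pairwise judgements are acceptably coherent; a higher value would signal that the comparisons should be revisited before the weights are used.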

3.2.4. Ranking and Robustness

Once the weights were set, the five technologies were ranked by the standard TOPSIS steps: build the decision matrix, normalise it, weight it, locate the ideal and anti-ideal points, measure each option's distance from both, and compute the closeness coefficient [42]. The decision matrix was populated from datasheet ranges and published evaluations, using midpoint estimates. Robustness was checked two ways. Each weight was varied by plus or minus twenty percent with the others rescaled, and the ranking was recomputed. Three policy scenarios were also tested: a safety-first case that doubles the detection-limit and response-time weights, a cost-first case that doubles the cost weight, and a certification-first case that doubles the certification weight. The detail is in Appendix B.
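The TOPSIS steps listed above can be sketched in a few lines. This is a minimal illustration assuming vector normalisation and Euclidean distance, which are the conventional choices; the three alternatives, their scores and the weights are hypothetical placeholders and are not the values in Table 3. The function name `topsis` is illustrative.

```python
import math


def topsis(matrix, weights, benefit):
    """Closeness coefficients for each alternative.

    matrix[i][j] is the score of alternative i on criterion j.
    benefit[j] is True when larger is better, False when smaller is better.
    Assumes no column is all zeros and the alternatives are not all identical.
    """
    m, n = len(matrix), len(weights)
    # Steps 1-3: vector-normalise each column, then apply the weights.
    norms = [math.sqrt(sum(matrix[i][j] ** 2 for i in range(m))) for j in range(n)]
    v = [[weights[j] * matrix[i][j] / norms[j] for j in range(n)] for i in range(m)]
    # Step 4: ideal and anti-ideal points, criterion by criterion.
    columns = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    # Steps 5-6: distances to both points and the closeness coefficient.
    scores = []
    for row in v:
        d_plus = math.dist(row, ideal)
        d_minus = math.dist(row, anti)
        scores.append(d_minus / (d_plus + d_minus))
    return scores


if __name__ == "__main__":
    # Hypothetical data: columns are detection limit (ppm, lower is better),
    # response time (s, lower is better) and maturity score (higher is better).
    names = ["A", "B", "C"]
    matrix = [[1.0, 5.0, 9.0], [10.0, 20.0, 5.0], [50.0, 30.0, 3.0]]
    weights = [0.5, 0.3, 0.2]
    benefit = [False, False, True]
    ranked = sorted(zip(names, topsis(matrix, weights, benefit)),
                    key=lambda pair: pair[1], reverse=True)
    for name, score in ranked:
        print(name, round(score, 3))
```

An alternative that is best on every criterion coincides with the ideal point and scores 1; one that is worst on every criterion scores 0. Real sensor families sit between these extremes, which is why the weights matter.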

4. Synthesis and Results

4.1. The Vent Gas Profile a Cargo Sensor Must Catch

Reading the aviation-relevant slice of the off-gas literature against the Bugryniec consolidation gives a consistent picture [12,13,15,24,25,26,27]. For NMC and NCA cells at high charge, hydrogen sits at thirty to fifty percent of the mixture, the carbon oxides together at thirty to forty percent, and the hydrocarbons fill the remaining fifteen to twenty-five percent. Lithium iron phosphate cells at high charge shift the balance toward hydrogen and hydrocarbons and away from the carbon oxides. Below roughly forty percent state of charge the hydrogen fraction drops and the mixture's lower flammability limit rises, which is the quantitative reason the thirty percent shipping cap is more than a bureaucratic line [10,11,13]. Total gas volume rises with capacity and charge, and the FAA standardised method puts it at one to five litres per ampere-hour at standard conditions across the tested cells [13].
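The effect of composition on the mixture's lower flammability limit can be illustrated with Le Chatelier's mixing rule. This is a rough sketch, not a result from the cited tests: the two compositions are hypothetical (the first is chosen to sit inside the ranges quoted above), the pure-gas limits are standard handbook values in air, and carbon dioxide is treated as a simple diluent, which ignores its additional inerting effect. The name `mixture_lfl` is illustrative.

```python
# Lower flammability limits in air, percent by volume (handbook values).
LFL = {"H2": 4.0, "CO": 12.5, "CH4": 5.0, "C2H4": 2.7, "C2H6": 3.0}


def mixture_lfl(mole_fractions):
    """Le Chatelier estimate of the LFL (vol %) of a vent-gas mixture in air.

    Species not listed in LFL (for example CO2) are treated as pure diluent.
    """
    total = sum(mole_fractions.values())
    fuel_term = sum(x / LFL[g] for g, x in mole_fractions.items() if g in LFL)
    if fuel_term == 0:
        raise ValueError("mixture contains no flammable species")
    return total / fuel_term


if __name__ == "__main__":
    # Hypothetical high state-of-charge mixture, within the quoted ranges.
    high_soc = {"H2": 0.40, "CO": 0.15, "CO2": 0.20, "CH4": 0.10, "C2H4": 0.15}
    # Hypothetical low state-of-charge mixture: less hydrogen, more CO2.
    low_soc = {"H2": 0.15, "CO": 0.10, "CO2": 0.55, "CH4": 0.10, "C2H4": 0.10}
    print(round(mixture_lfl(high_soc), 2), round(mixture_lfl(low_soc), 2))
```

With these illustrative inputs the hydrogen-poor mixture has the higher limit, so more vent gas must accumulate before the surrounding air becomes flammable, which is the direction of the effect described above.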
Timing is the part that matters most for detection. The first venting, when the valve opens and carbonate-rich gas escapes, can lead full runaway by anything from seconds to minutes depending on how the cell was triggered and how it is built [24,32]. For a sensor that responds in one to ten seconds, that lead time is workable. The real question is whether the device can register the relevant gas at the concentration reached during first venting under the ventilation conditions of a real hold, and that question is what the ranking in Section 4.3 is built to answer.
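An order-of-magnitude check helps frame that question. The sketch below assumes a sealed, perfectly mixed volume with no ventilation, uses the whole-event gas yield rather than the smaller first-venting release, and neglects the vent gas volume itself in the denominator. The 3 Ah cell and the 4 cubic metre free volume are hypothetical, and the function name `well_mixed_ppm` is illustrative.

```python
def well_mixed_ppm(capacity_ah, litres_per_ah, mole_fraction, free_volume_m3):
    """Bulk concentration (ppm by volume) of one species after venting.

    Assumes perfect mixing in a sealed volume at standard conditions and
    neglects the added gas volume relative to the free air volume.
    """
    species_litres = capacity_ah * litres_per_ah * mole_fraction
    free_litres = free_volume_m3 * 1000.0
    return species_litres / free_litres * 1e6


if __name__ == "__main__":
    # Hypothetical case: one 3 Ah cell in 4 m^3 of free air.
    # Gas yield range (1-5 L/Ah) and hydrogen fraction range (0.3-0.5)
    # are the figures quoted in the text.
    low = well_mixed_ppm(3.0, 1.0, 0.3, 4.0)
    high = well_mixed_ppm(3.0, 5.0, 0.5, 4.0)
    print(f"hydrogen: {low:.0f} to {high:.0f} ppm")
```

Because real venting is neither instantaneous nor well mixed, local concentrations near the cell will be higher and bulk concentrations during first venting lower than this estimate; the calculation only indicates the scale a sensor specification has to address.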

4.2. What the Incident Record Shows

The merged FAA, ULSE, ASRS and EASA record tells a clear and worsening story. FAA verified counts climbed from the thirty-to-forty range in the early 2010s to 89 in 2024, the highest on record and sixteen percent above 2023 [3,4]. The mid-2025 figure of 38 by the end of June shows the elevated rate continuing [5]. In 2024 the large majority of events (77 of 89) were on passenger aircraft and the remaining 12 on cargo aircraft, with battery packs alone making up close to forty percent of the device-level total and spare batteries, e-cigarettes and laptops filling out the rest [4]. The UL program, reaching the ground and ramp events that mandatory reporting misses, recorded the forty percent rise in cargo events across 2021 to 2025 [6,7].
The mode of detection recorded in these events is particularly instructive. In a large share of cabin events the first warning comes from a passenger or a crew member who sees or smells something, not from an installed sensor. The Air Busan loss in January 2025 is the clearest recent example: passengers and crew caught it, yet the airframe was lost anyway [6]. Cargo events are worse in a particular way. No one is in the hold to smell anything, so detection falls entirely on the photoelectric system, and the lag between the start of venting and the alarm is exactly the deficit this review is about [3,6,12,13]. Several of the 2024 cargo-aircraft events were first noticed by ground handlers before loading, which is itself telling: the off-gas signature exists well before any onboard sensor could see it, and an argument can be made for gas detection at the unit load device stage, on the ramp, as well as in the air [4].

4.3. The Sensor Ranking

The weight vector derived in Appendix A ranks detection limit first, followed by certification readiness and response time; selectivity and flight-envelope tolerance occupy the middle positions, and power and cost rank lowest. This ordering follows from the operational context: early detection and certification eligibility are primary constraints, whereas sensor cost is negligible relative to airframe value. Table 3 gives the decision matrix and Table 4 the resulting ranking.
The electrochemical sensor leads, carried by a very low detection limit, very low power draw and high certification maturity. Non-dispersive infrared follows, strong on certification and selectivity even though its detection limit is higher. Tunable diode laser spectroscopy comes third: its detection limit and response time are the best in the set, but cost and a lower readiness level pull it down. Metal-oxide ranks fourth, where its cost advantage cannot make up for weaker selectivity, and the photoionisation detector ranks last, mainly because it cannot see hydrogen, which is the single most diagnostic early gas.
Robustness checks are as informative as the base-case ranking. The top-three order holds under every twenty-percent single-weight perturbation. In the safety-first scenario, tunable diode spectroscopy climbs past infrared into second on the strength of its detection limit and speed, while the electrochemical sensor keeps first. In the certification-first scenario the electrochemical and infrared positions harden and the laser option falls further back. In the cost-first scenario metal-oxide overtakes the laser option for third. The electrochemical sensor leads throughout, but the order behind it shifts with the priority structure and no single technology covers every Class 2 species, and that is the finding that drives the recommendation.
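The weight manipulations behind these checks, described in Section 3.2.4, can be sketched as below. Two points are assumptions rather than statements of the Appendix B procedure: the untouched weights are rescaled proportionally so the vector still sums to one, and scenario weights are renormalised after doubling. The four-element weight vector is hypothetical and the function names are illustrative.

```python
def perturb_weight(weights, index, factor):
    """Scale one weight by `factor` and rescale the rest so the sum stays 1.

    Assumes the input weights already sum to 1.
    """
    new_value = weights[index] * factor
    rest = sum(weights) - weights[index]
    scale = (1.0 - new_value) / rest
    return [new_value if i == index else w * scale for i, w in enumerate(weights)]


def single_weight_perturbations(weights, delta=0.20):
    """Yield (index, factor, new_weights) for each +/- delta perturbation."""
    for i in range(len(weights)):
        for factor in (1.0 - delta, 1.0 + delta):
            yield i, factor, perturb_weight(weights, i, factor)


def scenario(weights, doubled):
    """Double the weights at the given indices, then renormalise."""
    raw = [w * 2.0 if i in doubled else w for i, w in enumerate(weights)]
    total = sum(raw)
    return [w / total for w in raw]


if __name__ == "__main__":
    base = [0.4, 0.3, 0.2, 0.1]  # hypothetical weights, not Appendix A
    for i, factor, w in single_weight_perturbations(base):
        print(i, factor, [round(x, 3) for x in w])
    print("first criterion doubled:", [round(x, 3) for x in scenario(base, {0})])
```

Each perturbed or scenario weight vector would then be passed back through the ranking step, and the resulting order compared with the base case.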
Because no one device covers the Class 2 envelope, the defensible design is a combination. An electrochemical hydrogen channel gives the earliest and lowest-limit signal of the initial breakdown. A non-dispersive infrared channel covers carbon monoxide and carbon dioxide with the certification maturity needed to fly soon. A small metal-oxide array adds hydrocarbon classification and redundancy at low marginal cost. The three can sit together at the unit load device or spread through the compartment, fused by a simple weighted-vote logic. This conclusion lines up with the chemistry-specific advice Wang and colleagues reached from a different method, a 247-case principal component analysis [19], which provides independent support that the recommendation is not an artefact of the weighting choices made here.
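One way to read "simple weighted-vote logic" is sketched below. It is a hypothetical illustration rather than a proposed certified design: the channel names, vote weights, ppm thresholds and quorum are all placeholders, and a flight implementation would also need persistence timing and fault handling that are omitted here.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    name: str
    weight: float     # vote weight (hypothetical)
    threshold: float  # alarm threshold in ppm (hypothetical)


def fused_alarm(channels, readings, quorum=0.5):
    """Return (alarm, score): weighted share of channels above threshold.

    A channel with no reading is treated as not voting for an alarm.
    """
    total = sum(c.weight for c in channels)
    votes = sum(c.weight for c in channels
                if readings.get(c.name, 0.0) >= c.threshold)
    score = votes / total
    return score >= quorum, score


if __name__ == "__main__":
    channels = [
        Channel("EC_H2", weight=3.0, threshold=50.0),
        Channel("NDIR_CO", weight=2.0, threshold=100.0),
        Channel("MOX_HC", weight=1.0, threshold=200.0),
    ]
    readings = {"EC_H2": 80.0, "NDIR_CO": 150.0, "MOX_HC": 10.0}
    print(fused_alarm(channels, readings))
```

Weighting the hydrogen channel most heavily reflects its role as the earliest signal, while requiring a quorum limits false alarms from a single drifting sensor; the specific values would have to come from testing.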

5. Discussion

5.1. The Class 2 Taxonomy as a Design Tool

The argument made here has not, to the author's knowledge, been stated plainly before: the dangerous goods classification scheme is a sensor design specification in disguise. The Class 2 divisions are not arbitrary filing labels. They encode flammability, asphyxiation and toxicity, which are exactly the properties that set both the hazard of a vented mixture and the detection job that follows. Build the sensor architecture around the Class 2 divisions and it covers the hazard pathways while staying legible to the people who already work in that language: the dangerous goods officers, the freight forwarders, the certification authorities. The compliance function and the fire-protection function—previously housed in separate regulatory silos—would rest on a shared analytical foundation.

5.2. What This Implies for ICAO, IATA and EASA

Three near-term regulatory actions are identifiable. The IATA Dangerous Goods Regulations could add a non-binding guidance section on cargo-hold gas monitoring, giving operators a reference architecture built on the combined design above [11]. EASA could open a notice of proposed amendment on the certification basis for gas-phase fire detection, building on CS 25.857 and on its own recent work on detecting lithium batteries with screening equipment [36,37,46]. And ICAO could recognise gas-phase detection as a compensating measure under the Annex 18 safety-management framework. The forthcoming UL 5810 standard on active fire protection for air cargo containers offers a route through which the combined architecture could enter common use ahead of binding regulation [7].
A second regulatory implication deserves care. The thirty percent state-of-charge cap rests on the empirical fact that vent volume and flammability move unfavourably above about forty percent charge [10,11,13]. This review reinforces that basis and supports keeping the cap while there is no independent in-flight detection layer. If such a layer were deployed and proven, a case could be made for revisiting the cap under specific operator-approved conditions, but that is a conversation for after the detection problem is solved, not before.

5.3. Engineering Problems Left Open

Three engineering questions remain unsettled inside the scope of this framework. First, where the sensors sit in a unit load device is constrained by loading geometry, by the airflow pattern inside a Class C compartment, and by the certification basis for electronics inside a container. Second, calibration drift, which is worst for metal-oxide sensors at roughly ten to fifteen percent per year, calls for either periodic recalibration or compensating algorithms, and the certification status of algorithmic compensation in aerospace is not yet settled. Third, the power and data architecture for a distributed sensor network across many containers is non-trivial, with passive RFID-tagged sensors, autonomous battery nodes and energy-harvesting designs each carrying different certification burdens. The framework gives the criteria against which these options can be judged; it does not pick among them.

5.4. Limitations

Several limits deserve a plain statement. The synthesis builds on the Bugryniec meta-analysis, which is efficient but inherits whatever biases sit in the underlying experiments, particularly the heavy weighting toward 18650 cells and the thinness of large-format pouch and prismatic data [15]. The incident record depends on voluntary and mandatory reporting, both of which under-count, so part of the rising trend may be better reporting rather than more events. The weights come from a literature synthesis rather than independent elicitation, which is why the expert-panel extension is planned. The TOPSIS model assumes the criteria are independent, while detection limit and selectivity plainly interact, so an analytic network process extension is a reasonable next step. And the framework treats the cargo hold as a fixed set of constraints, when real holds differ in airflow, agent and instrumentation across operators.

5.5. What to Study Next

Four directions follow directly. The first is validation in a controlled fire-test laboratory, with the combined architecture instrumented around a representative container and a calibrated runaway source. The second is extension to other Class 2 cargo, including compressed gas cylinders and chemical oxygen generators, which share part of the detection problem. The third is machine-learning classification on the fused multi-sensor time series, building on cycle-level demonstrations in adjacent sensing work [47]. The fourth is regulatory-science research on the certification pathway itself, including standard test scenarios that authorities could use to compare competing architectures on equal terms.

6. Conclusions

A failing lithium-ion cell emits a chemically defined gas mixture that the Class 2 dangerous goods scheme already names. The cargo hold, however, is instrumented for smoke. The gap between the start of venting and the smoke alarm is measurable, and it is the gap in which an early warning system would do its work. This review has read the consolidated chemistry of vent gas against the worsening operational record, and has built a transparent AHP-TOPSIS model to choose among the sensor technologies that could close the gap. No single device covers the full Class 2 envelope. The recommended architecture combines an electrochemical hydrogen channel, a non-dispersive infrared channel for the carbon oxides, and a metal-oxide array for hydrocarbon classification—organised around the Class 2 divisions rather than assembled ad hoc. Treating the dangerous goods taxonomy as a design input—rather than a filing category—gives compliance and fire-protection communities a shared technical footing. Regulators already have the relevant language; what is needed is a certification pathway. The argument has real limits, documented in Section 5.4. Priority next steps are laboratory validation of the combined architecture, extension to other Class 2 cargo categories, sensor fusion algorithm development, and the certification science that any operational deployment will require.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable. This review uses only publicly available secondary sources and a literature-derived weighting. A planned extension involving an expert panel will be submitted for review by the Alanya Alaaddin Keykubat University ethics committee before any primary data collection.

Data Availability Statement

All sources are publicly accessible. The FAA incident register is at faa.gov/hazmat/resources/lithium_batteries; the ASRS database is at asrs.arc.nasa.gov; the UL TRIP report is at ulse.org. The decision matrix and pairwise comparison are reproduced in full in the Appendices.

Use of Artificial Intelligence

Artificial intelligence writing-assistance tools were used in the preparation of this manuscript. The author accepts full responsibility for the scientific content, the accuracy of all data and references, and the integrity of the final submitted work.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A. Weighting

The pairwise comparison matrix below was built by triangulating the published sensor-evaluation reviews and FAA detection-priority findings cited in Section 3.2.3. The Saaty 1–9 scale was used. The consistency ratio is about 0.03.
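|    | C1  | C2  | C3  | C4  | C5 | C6  | C7  |
|----|-----|-----|-----|-----|----|-----|-----|
| C1 | 1   | 2   | 3   | 3   | 5  | 2   | 4   |
| C2 | 1/2 | 1   | 2   | 2   | 4  | 1   | 3   |
| C3 | 1/3 | 1/2 | 1   | 1   | 3  | 1/2 | 2   |
| C4 | 1/3 | 1/2 | 1   | 1   | 3  | 1/2 | 2   |
| C5 | 1/5 | 1/4 | 1/3 | 1/3 | 1  | 1/3 | 1/2 |
| C6 | 1/2 | 1   | 2   | 2   | 3  | 1   | 3   |
| C7 | 1/4 | 1/3 | 1/2 | 1/2 | 2  | 1/3 | 1   |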
The eigenvector weights are w(C1) = 0.28, w(C2) = 0.18, w(C6) = 0.20, w(C3) = 0.10, w(C4) = 0.10, w(C7) = 0.10 and w(C5) = 0.04. With the principal eigenvalue near 7.24, the consistency index is about 0.04, and against the Saaty random index of 1.32 for a seven-by-seven matrix the consistency ratio is about 0.03, which passes the 0.10 test.
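The eigenvector and consistency calculation described above can be sketched in a few lines. This is a minimal illustration, not the author's own script: it assumes plain power iteration for the principal eigenvector and estimates the principal eigenvalue as the mean of (Aw)_i / w_i. The names `ahp_weights` and `RANDOM_INDEX_N7` are hypothetical. Because the reported weights are rounded and the author's numerical procedure is not stated, an independent recomputation of this kind need not reproduce the quoted figures digit for digit.

```python
# Pairwise comparison matrix from Appendix A (rows and columns C1..C7).
A = [
    [1,   2,   3,   3,   5, 2,   4],
    [1/2, 1,   2,   2,   4, 1,   3],
    [1/3, 1/2, 1,   1,   3, 1/2, 2],
    [1/3, 1/2, 1,   1,   3, 1/2, 2],
    [1/5, 1/4, 1/3, 1/3, 1, 1/3, 1/2],
    [1/2, 1,   2,   2,   3, 1,   3],
    [1/4, 1/3, 1/2, 1/2, 2, 1/3, 1],
]
RANDOM_INDEX_N7 = 1.32  # Saaty random index for n = 7, as quoted in the text


def ahp_weights(matrix, iterations=100):
    """Return (weights, lambda_max) using power iteration (assumed method)."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]  # keep the weights normalised to unit sum
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    return w, lambda_max


weights, lambda_max = ahp_weights(A)
n = len(A)
consistency_index = (lambda_max - n) / (n - 1)
consistency_ratio = consistency_index / RANDOM_INDEX_N7

for code, w in zip(["C1", "C2", "C3", "C4", "C5", "C6", "C7"], weights):
    print(f"{code}: {w:.3f}")
print(f"lambda_max = {lambda_max:.3f}, CI = {consistency_index:.3f}, "
      f"CR = {consistency_ratio:.3f}")
```

The acceptance rule is the one stated above: the matrix is treated as acceptably consistent when the consistency ratio is below 0.10.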

Appendix B. Sensitivity Detail

Each weight was varied by plus or minus twenty percent with the remaining weights rescaled to keep unit sum, and TOPSIS was recomputed. The top-three order (EC, NDIR, TDLAS) held in every single-weight perturbation. The three policy scenarios, unlike the perturbations, did change the order. Under the safety-first scenario, with the detection-limit and response-time weights doubled, TDLAS rose to second while EC held first. Under the certification-first scenario, with the certification weight doubled, EC and NDIR strengthened and TDLAS fell to fourth. Under the cost-first scenario, with the cost weight doubled, MOX rose to third ahead of TDLAS. No scenario produced a single dominant technology, which is the basis for the combined-architecture recommendation in Section 4.3.
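The one-at-a-time perturbation step can be sketched as follows. The text says only that the remaining weights were rescaled to keep unit sum; this sketch assumes the rescaling is proportional, which is one common choice and may not be the one the author used. The names `perturb` and `one_at_a_time` are hypothetical, and the TOPSIS recomputation for each case is left out.

```python
# Rounded weights as reported in Appendix A.
BASE_WEIGHTS = {
    "C1": 0.28, "C2": 0.18, "C3": 0.10, "C4": 0.10,
    "C5": 0.04, "C6": 0.20, "C7": 0.10,
}


def perturb(weights, code, factor):
    """Scale one weight by `factor` and rescale the others proportionally
    (assumed rule) so that the full set still sums to one."""
    new_value = weights[code] * factor
    rest = sum(v for k, v in weights.items() if k != code)
    scale = (1.0 - new_value) / rest
    return {k: (new_value if k == code else v * scale)
            for k, v in weights.items()}


def one_at_a_time(weights, delta=0.20):
    """Yield (criterion, factor, perturbed weights) for every +/- delta case."""
    for code in weights:
        for factor in (1.0 - delta, 1.0 + delta):
            yield code, factor, perturb(weights, code, factor)


cases = list(one_at_a_time(BASE_WEIGHTS))
for code, factor, w in cases:
    print(code, f"x{factor:.1f}", {k: round(v, 3) for k, v in w.items()})
```

Seven criteria with two directions each give fourteen weight sets, each of which would be passed to the ranking step in turn.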

References

  1. Wang, Q.; Mao, B.; Stoliarov, S.I.; Sun, J. A review of lithium ion battery failure mechanisms and fire prevention strategies. Prog. Energy Combust. Sci. 2019, 73, 95–131.
  2. International Air Transport Association. Lithium Battery Risk and Mitigation Guidance for Operators, 8th ed.; IATA: Montreal, QC, Canada, 2024.
  3. Federal Aviation Administration. Lithium Battery Incidents Involving Smoke, Fire, Extreme Heat or Explosion; FAA Office of Security and Hazardous Materials Safety: Washington, DC, USA, continuously updated.
  4. Federal Aviation Administration. Lithium Batteries: A Hot Topic — 2024 Incident Statistics; FAA Cleared for Takeoff, 2025.
  5. Federal Aviation Administration. On the Case: Preventing Lithium Battery Hazards. FAA Blog, August 2025.
  6. Underwriters Laboratories Standards and Engagement. Rising Incidents, Shifting Responsibility: Lithium Batteries in the Aviation Cargo Supply Chain; ULSE Insight Report; 2026.
  7. Underwriters Laboratories Standards and Engagement. Lithium-Ion Battery Incidents in Aviation: 2024 Data Review (Thermal Runaway Incident Program); ULSE: Washington, DC, USA, 2025.
  8. National Transportation Safety Board. Aircraft Accident Report AAR-13/02: UPS Flight 6, Boeing 747-44AF, Dubai, 3 September 2010; NTSB: Washington, DC, USA, 2013.
  9. Aviation and Railway Accident Investigation Board; Republic of Korea. Asiana Airlines Flight 991 Accident Investigation Report; ARAIB: Sejong, Republic of Korea, 2015.
  10. International Civil Aviation Organization. Technical Instructions for the Safe Transport of Dangerous Goods by Air, Doc 9284; ICAO: Montreal, QC, Canada, 2025–2026 ed.
  11. International Air Transport Association. Dangerous Goods Regulations, 67th ed.; IATA: Montreal, QC, Canada, 2026.
  12. Federal Aviation Administration. Lithium Battery Thermal Runaway Vent Gas Analysis, DOT/FAA/TC-15/59; FAA William J. Hughes Technical Center: Atlantic City, NJ, USA, 2016.
  13. Federal Aviation Administration. Evaluation of Lithium Battery Thermal Runaway Vent Gas Combustion, DOT/FAA/TC-22/12; FAA William J. Hughes Technical Center: Atlantic City, NJ, USA, 2023.
  14. United Nations Economic Commission for Europe. Recommendations on the Transport of Dangerous Goods: Model Regulations, 23rd rev. ed.; UN: Geneva, Switzerland, 2023.
  15. Bugryniec, P.J.; Resendiz, E.G.; Nwophoke, S.M.; Khanna, S.; James, C.; Brown, S.F. Review of gas emissions from lithium-ion battery thermal runaway failure — considering toxic and flammable compounds. J. Energy Storage 2024, 87, 111288.
  16. Han, D.; Wang, J.; Yin, C.; Zhao, Y. Advances in early warning of thermal runaway in lithium-ion battery energy storage systems. Adv. Sens. Res. 2025, 4, 2400165.
  17. Teng, Z.; Lv, C. Detection toward early-stage thermal runaway gases of Li-ion battery by semiconductor sensor. Front. Chem. 2025, 13, 1586903.
  18. Zhang, J.; Li, Z.; Huang, L. A review of gas-sensitive materials for lithium-ion battery thermal runaway monitoring. Molecules 2026, 31, 347.
  19. Song, Y.; Jiang, X.; Lyu, N.; Lu, H.; Zhang, D.; Li, H.; Jin, Y. Early warning of lithium-ion battery thermal runaway based on gas sensors. J. Energy Chem. 2026; in press.
  20. Perrow, C. Normal Accidents: Living with High-Risk Technologies, updated ed.; Princeton University Press: Princeton, NJ, USA, 1999; ISBN 978-0-691-00412-9.
  21. Weick, K.E.; Sutcliffe, K.M. Managing the Unexpected: Sustained Performance in a Complex World, 3rd ed.; Jossey-Bass/Wiley: Hoboken, NJ, USA, 2015; ISBN 978-1-118-86241-7.
  22. Tornatzky, L.G.; Fleischer, M. The Processes of Technological Innovation; Lexington Books: Lexington, MA, USA, 1990; ISBN 978-0-669-20091-0.
  23. Feng, X.; Ouyang, M.; Liu, X.; Lu, L.; Xia, Y.; He, X. Thermal runaway mechanism of lithium ion battery for electric vehicles: a review. Energy Storage Mater. 2018, 10, 246–267.
  24. Feng, X.; Zheng, S.; Ren, D.; He, X.; Wang, L.; Cui, H.; et al. Investigating the thermal runaway mechanisms of lithium-ion batteries based on thermal analysis database. Appl. Energy 2019, 246, 53–64.
  25. Golubkov, A.W.; Fuchs, D.; Wagner, J.; Wiltsche, H.; Stangl, C.; Fauler, G.; Voitic, G.; Thaler, A.; Hacker, V. Thermal-runaway experiments on consumer Li-ion batteries with metal-oxide and olivin-type cathodes. RSC Adv. 2014, 4, 3633–3642.
  26. Golubkov, A.W.; Scheikl, S.; Planteu, R.; Voitic, G.; Wiltsche, H.; Stangl, C.; Fauler, G.; Thaler, A.; Hacker, V. Thermal runaway of commercial 18650 Li-ion batteries with LFP and NCA cathodes — impact of state of charge and overcharge. RSC Adv. 2015, 5, 57171–57186.
  27. Larsson, F.; Andersson, P.; Blomqvist, P.; Mellander, B.-E. Toxic fluoride gas emissions from lithium-ion battery fires. Sci. Rep. 2017, 7, 10018.
  28. Said, A.O.; Lee, C.; Stoliarov, S.I. Experimental investigation of cascading failure in 18650 lithium ion cell arrays: impact of cathode chemistry. J. Power Sources 2020, 446, 227347.
  29. Federal Aviation Administration. 14 CFR Part 25.857: Cargo Compartment Classification; FAA: Washington, DC, USA, current ed.
  30. Federal Aviation Administration. Summary of FAA Studies Related to the Hazards Produced by Lithium Cells in Thermal Runaway in Aircraft Cargo Compartments; FAA Fire Safety Branch: Atlantic City, NJ, USA, 2018.
  31. Federal Aviation Administration. Flammability Limits of Lithium-Ion Battery Thermal Runaway Vent Gas in Air and the Inerting Effects of Halon 1301; FAA Technical Report; Atlantic City, NJ, USA, 2018.
  32. Cheng, J.; et al. Intelligent early detection of lithium-ion battery thermal runaway via H2/CO sensor arrays and signal processing algorithms. Sens. Actuators B Chem. 2026, 422, 138712.
  33. Luo, L.; Chen, J.; Hui, A.G.; Liu, R.; Zhou, Y.; Liang, H.; Wang, Z.; Luo, H.; Fang, F. Highly sensitive non-dispersive infrared gas sensor for monitoring carbon dioxide emissions from lithium-ion battery thermal runaway. Micromachines 2025, 16, 36.
  34. United Nations Economic Commission for Europe. Global Technical Regulation No. 20: Electric Vehicle Safety; UNECE: Geneva, Switzerland, 2018.
  35. Federal Aviation Administration. Advisory Circular 120-80B: In-Flight Fires; FAA: Washington, DC, USA, 2014.
  36. European Union Aviation Safety Agency. Certification Specifications for Large Aeroplanes, CS-25; EASA: Cologne, Germany, current amendment.
  37. European Union Aviation Safety Agency. Annual Safety Review 2024; EASA: Cologne, Germany, 2024.
  38. National Aeronautics and Space Administration. Aviation Safety Reporting System; NASA Ames Research Center: Moffett Field, CA, USA.
  39. International Civil Aviation Organization. Accident/Incident Data Reporting (ADREP) System; ICAO: Montreal, QC, Canada.
  40. Saaty, T.L. The Analytic Hierarchy Process: Planning, Priority Setting, Resource Allocation; McGraw-Hill: New York, NY, USA, 1980; ISBN 0-07-054371-2.
  41. Saaty, T.L. Decision making with the analytic hierarchy process. Int. J. Serv. Sci. 2008, 1, 83–98.
  42. Hwang, C.L.; Yoon, K. Multiple Attribute Decision Making: Methods and Applications; Lecture Notes in Economics and Mathematical Systems; Springer: Berlin/Heidelberg, Germany, 1981; Vol. 186, ISBN 978-3-540-10558-9.
  43. Behzadian, M.; Otaghsara, S.K.; Yazdani, M.; Ignatius, J. A state-of-the-art survey of TOPSIS applications. Expert Syst. Appl. 2012, 39, 13051–13069.
  44. Mardani, A.; Jusoh, A.; Nor, K.M.D.; Khalifah, Z.; Zakwan, N.; Valipour, A. Multiple criteria decision-making techniques and their applications — a review 2000–2014. Econ. Res.-Ekon. Istraz. 2015, 28, 516–571.
  45. Canepa, M. XAI–MCDA-HoDEM: an explainable multi-criteria decision framework for maritime and port decarbonization. Gases 2026, 6, 25.
  46. European Union Aviation Safety Agency. Detection of Lithium Batteries Using Security Screening Equipment; EASA: Cologne, Germany, 2023.
  47. Palmer, M.D.; Crew, A.P.; Bell, M.J. Cycle-level evaluation of a temperature-modulated MOX digital nose for ethylene presence classification in fruit headspace. Gases 2026, 6, 21.
Table 1. Principal lithium-ion vent gases mapped to the UN Recommendations on the Transport of Dangerous Goods. Mole-fraction ranges are drawn from the consolidated experimental literature [15,25,26].

| Vent gas | UN no. | DG division | Mole fraction | Note |
|----------|--------|-------------|---------------|------|
| Hydrogen (H2) | 1049 | 2.1 flammable | 30–50% | Rises with SOC; LFL 4% |
| Carbon monoxide (CO) | 1016 | 2.3 toxic | 10–25% | OSHA PEL 50 ppm |
| Carbon dioxide (CO2) | 1013 | 2.2 non-flammable | 10–30% | Often largest by volume |
| Methane (CH4) | 1971 | 2.1 flammable | 5–15% | LFL 5% |
| Ethylene (C2H4) | 1962 | 2.1 flammable | 2–10% | Marks advanced runaway |
| Propylene (C3H6) | 1077 | 2.1 flammable | 1–5% | LFL 2.0% |
| Hydrogen fluoride (HF) | 1052 | 8 (sub. 6.1) | trace–0.5% | Crew and avionics hazard |

Table 2. Evaluation criteria for the cargo-hold sensor decision.

| Code | Criterion | Type | Scale |
|------|-----------|------|-------|
| C1 | Detection limit for the target mixture | Benefit | lower ppm scores higher |
| C2 | Response time T90 | Cost | shorter seconds scores higher |
| C3 | Selectivity vs humidity, kerosene vapour, CO2 | Benefit | target/interferent ratio |
| C4 | Flight-envelope tolerance (−40/+70 °C; 200–1013 hPa; 0–95% RH) | Benefit | ordinal 1–9 |
| C5 | Power per node | Cost | lower mW scores higher |
| C6 | Certification readiness (TSO, DO-160) | Benefit | TRL 6–9 |
| C7 | Unit and lifecycle cost over 5-year MTBF | Cost | lower USD scores higher |

Table 3. Decision matrix from datasheet ranges and published evaluations [16,17,18,32,33]. Analysis uses range midpoints.

| Sensor | C1 LOD (ppm) | C2 T90 (s) | C3 Selectivity | C4 Envelope | C5 Power (mW) | C6 TRL | C7 Cost (USD) |
|--------|--------------|------------|----------------|-------------|---------------|--------|---------------|
| MOX | 0.5–2 | 10–30 | 3–5 | 5 | 100–200 | 7 | 5–25 |
| NDIR | 5–20 | 20–60 | 7–8 | 8 | 150–300 | 9 | 100–400 |
| EC | 0.05–1 | 20–60 | 7–8 | 6 | 1–10 | 9 | 50–200 |
| PID | 0.1–1 | 3–10 | 5–6 | 5 | 200–400 | 7 | 300–800 |
| TDLAS | 0.01–0.1 | 1–5 | 9 | 8 | 1000–2000 | 6 | 5000+ |

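For readers who want to see the mechanics, the following is a minimal sketch of a textbook TOPSIS pass over the Table 3 midpoints with the rounded Appendix A weights. Several choices here are assumptions rather than the author's stated method: vector (Euclidean) normalisation of raw midpoints, the reading that lower is better for C1, C2, C5 and C7, and the value 5000 standing in for the open-ended TDLAS cost "5000+". The function name `topsis` is hypothetical. The scoring and normalisation behind Table 4 are not given in this section, so the output of this sketch should not be expected to reproduce the published coefficients or ranking.

```python
import math

WEIGHTS = [0.28, 0.18, 0.10, 0.10, 0.04, 0.20, 0.10]  # Appendix A, C1..C7
# True: larger raw value is better. False: smaller is better (assumed reading).
LARGER_IS_BETTER = [False, False, True, True, False, True, False]

# Range midpoints from Table 3; TDLAS cost "5000+" taken as 5000 (assumption).
MIDPOINTS = {
    "MOX":   [1.25,  20.0, 4.0, 5.0,  150.0, 7.0,   15.0],
    "NDIR":  [12.5,  40.0, 7.5, 8.0,  225.0, 9.0,  250.0],
    "EC":    [0.525, 40.0, 7.5, 6.0,    5.5, 9.0,  125.0],
    "PID":   [0.55,   6.5, 5.5, 5.0,  300.0, 7.0,  550.0],
    "TDLAS": [0.055,  3.0, 9.0, 8.0, 1500.0, 6.0, 5000.0],
}


def topsis(matrix, weights, larger_is_better):
    """Return closeness coefficients using vector normalisation."""
    names = list(matrix)
    m = len(weights)
    # Column-wise Euclidean norms for normalisation.
    norms = [math.sqrt(sum(matrix[a][j] ** 2 for a in names)) for j in range(m)]
    # Weighted normalised matrix.
    v = {a: [weights[j] * matrix[a][j] / norms[j] for j in range(m)]
         for a in names}
    ideal, anti_ideal = [], []
    for j in range(m):
        column = [v[a][j] for a in names]
        if larger_is_better[j]:
            ideal.append(max(column))
            anti_ideal.append(min(column))
        else:
            ideal.append(min(column))
            anti_ideal.append(max(column))
    closeness = {}
    for a in names:
        d_plus = math.dist(v[a], ideal)        # distance to ideal solution
        d_minus = math.dist(v[a], anti_ideal)  # distance to anti-ideal solution
        closeness[a] = d_minus / (d_plus + d_minus)
    return closeness


scores = topsis(MIDPOINTS, WEIGHTS, LARGER_IS_BETTER)
for name, c in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {c:.3f}")
```

Raw-value vector normalisation lets a single extreme entry, such as the TDLAS cost, dominate its column, which is one reason a scored or rescaled decision matrix can give a different ordering.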
Table 4. TOPSIS closeness coefficients and ranking under the Appendix A weights.

| Sensor | D+ | D− | Closeness C* | Rank |
|--------|-------|-------|-------|---|
| EC | 0.022 | 0.063 | 0.741 | 1 |
| NDIR | 0.031 | 0.054 | 0.635 | 2 |
| TDLAS | 0.041 | 0.058 | 0.586 | 3 |
| MOX | 0.048 | 0.040 | 0.455 | 4 |
| PID | 0.057 | 0.029 | 0.337 | 5 |

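The closeness coefficient in Table 4 follows the standard TOPSIS definition C* = D− / (D+ + D−), where D+ and D− are the distances to the ideal and anti-ideal solutions. The short check below recomputes C* from the tabulated distances; the dictionary `TABLE_4` and the function `closeness` are illustrative names, and the tolerance allows for the three-decimal rounding of the published distances.

```python
# (D+, D-, reported C*) per sensor, copied from Table 4.
TABLE_4 = {
    "EC":    (0.022, 0.063, 0.741),
    "NDIR":  (0.031, 0.054, 0.635),
    "TDLAS": (0.041, 0.058, 0.586),
    "MOX":   (0.048, 0.040, 0.455),
    "PID":   (0.057, 0.029, 0.337),
}


def closeness(d_plus, d_minus):
    """Standard TOPSIS closeness coefficient."""
    return d_minus / (d_plus + d_minus)


recomputed = {name: closeness(dp, dm) for name, (dp, dm, _) in TABLE_4.items()}
ranking = sorted(recomputed, key=recomputed.get, reverse=True)

for name in ranking:
    reported = TABLE_4[name][2]
    print(f"{name}: recomputed {recomputed[name]:.3f}, reported {reported:.3f}")
```

The recomputed values agree with the reported coefficients to within rounding, and the resulting order matches the Rank column.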
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.