Preprint
Article

This version is not peer-reviewed.

Effects of Ecological Complexity on Student Identification Accuracy in a School-Based Citizen Science Program

Submitted:

14 April 2026

Posted:

15 April 2026


Abstract
School-based citizen science is increasingly used to support biodiversity monitoring, but the reliability of student-generated ecological data remains uncertain. We evaluated a nationwide barn owl (Tyto alba) pellet project in Israel involving 1,612 students and 107 teachers over two academic years. Teacher participation was moderate to high: 66.4% of teachers returned at least one complete kit, generating 3,333 prey records—mostly from small mammals—and demonstrating that schools can contribute significant ecological sampling effort across broad geographic areas. Student identification accuracy was moderate: 50% of pellet analyses were perfectly identified, with an overall item-level accuracy of 62%, and 76% of pellet attempts included at least one correctly identified prey item. The probability of perfect identification declined sharply with pellet complexity. Each additional prey item reduced the odds of correctly identifying all prey items in a pellet by about 68%, indicating that identification reliability decreases as sample complexity increases. Although many students initially found pellets disgusting, engagement remained high, and willingness to repeat the activity was strongly linked to enjoyment. These findings show that school-based citizen science can support biodiversity monitoring, but reliable ecological inference requires validation frameworks, especially when sample complexity is high.

1. Introduction

Scaling biodiversity monitoring over large areas is a key challenge in applied ecology [1,2]. Citizen science has emerged as a prominent approach to addressing this challenge by involving human participants directly in ecological data collection [3]. In human–natural systems, ecological knowledge is created through interactions between observers and the environment [4,5]. Reliable monitoring of small mammal communities is particularly important for understanding biodiversity dynamics [6], ecosystem services [7,8], and agricultural pest regulation [9]. However, conventional trapping approaches are labor-intensive, costly, and spatially constrained, prompting the use of indirect dietary methods such as owl pellet analysis [10]. Owl pellets are also used to determine the composition of small mammal communities and can provide a better representation of these communities than estimates from conventional trapping [11,12].
Barn owls (Tyto alba) are among the most widespread small mammal specialists [13]; they occur throughout the world, and their diet fluctuates with changes in prey abundance. Since barn owl pellets are easy to collect, store, and examine [14], diet studies based on them also serve as practical platforms for public participation in ecological research. However, when non-expert observers collect ecological data, the reliability of their inferences depends not only on their training and expertise but also on how human perceptual and cognitive constraints interact with ecological complexity [15,16]. Ecological inference from dietary items often requires reliable estimation of species composition, proportional abundance, and detection of rare taxa. Even moderate identification error may bias these estimates, particularly in samples containing multiple prey species. In Israel, barn owls are widely deployed as biological control agents in agriculture [17,18] to reduce rodent damage, resulting in extensive nest-box networks [19] and long-term dietary monitoring [20,21] across the country.
Citizen science has grown rapidly as a method for expanding biodiversity monitoring, biological research, and science education [22], but concerns remain regarding data accuracy, with quantitative evaluations showing that volunteer-generated data are sometimes, but not consistently, comparable to professional standards [23]. Most evaluations implicitly treat observer error as random variation or as a result of training quality. However, less attention has been paid to whether identification error arises predictably from interactions between observer capacity and the structure of ecological samples [24,25]. Within education, science education experts primarily focus on children’s learning experiences [26,27,28], and science educators often justify implementing citizen science in schools by citing data-quality evidence from adult-based citizen science projects [29].
Despite the dual scientific and educational aims of citizen science, school-based implementations are predominantly evaluated through an educational lens. Few studies have rigorously evaluated whether these programs generate data reliable enough for ecological inference under authentic classroom conditions [3,30]. Empirical evaluations show mixed results: in some cases, student measurements approximate professional data, but greater variance raises concerns about precision for ecological monitoring [31], while detailed error analyses reveal frequent procedural and identification mistakes [32]. Even in studies reporting acceptable average accuracy, it remains unclear whether identification errors systematically increase with task complexity. This uncertainty limits the ecological conclusions that can be drawn from student-generated data. These concerns extend beyond pellet analysis to any school-based activity requiring morphological or taxonomic discrimination, as similar variation in volunteer-generated data quality has been documented across citizen science contexts [33]. Therefore, distinguishing between correctable training deficiencies and performance limits caused by inherent complexity is essential for developing scalable ecological monitoring frameworks.
Teachers act as crucial mediators in shaping how ecological data are collected, since observer traits and behavior greatly impact data quality and interpretation [15,22,34]. In school-based citizen science programs, an additional concern is whether teachers can reliably follow standardized protocols and support the submission of scientifically valid data without direct supervision. Emerging research indicates that teachers’ ability to effectively facilitate school-based citizen science may be limited without structured professional development and ongoing support [35]. Teachers often report limited methodological guidance, challenges in choosing appropriate tools, and difficulties designing scientifically rigorous activities, showing that implementation fidelity cannot be assumed [36]. Additionally, case study research reveals significant variation in how teachers implement citizen science curricula, even with structured support materials, highlighting the influence of contextual and instructional factors on data quality in classroom settings [37].
The validity of biodiversity monitoring depends on the relationship between observation error and ecological complexity. When multiple taxa co-occur within samples, misclassification does not simply reduce accuracy; it can systematically distort community metrics and obscure rare species [15,38]. Understanding whether identification error scales predictably with ecological complexity is therefore central to determining whether citizen science systems can generate reliable ecological knowledge [33,39]. However, this relationship has rarely been quantified in school-based monitoring systems. In community monitoring, detection probability and misclassification error jointly determine whether species are recorded or overlooked [40]. When identification error increases with sample complexity, apparent absences may reflect observer limitations rather than ecological rarity, thereby biasing inferences about species richness and community composition. If identification error increases systematically with ecological complexity, then biases in species detection and abundance estimates may arise as inherent properties of the interaction between observers and ecological systems, rather than as correctable noise [15,24].
This study assessed a nationwide school-based citizen science program to determine how ecological complexity affects student identification accuracy in real classroom settings. Specifically, we examined: (1) whether teachers are likely to complete and return standardized research kits; (2) whether student prey identifications are accurate enough to support ecological diet analysis; and (3) whether increasing task complexity reduces identification success. This framework allows us to assess not only whether students participate successfully, but also how ecological complexity affects the reliability of student-generated ecological data. By analyzing how identification errors vary with pellet complexity, we determine when school-based participation can support species-level ecological inference and when expert validation becomes necessary.

2. Materials and Methods

2.1. Study Design

We implemented a nationwide school-based citizen science program in Israel during the 2018–2019 and 2019–2020 academic years to evaluate teacher participation, student engagement, and student accuracy in prey identification from barn owl pellets. The project was embedded within ongoing ecological monitoring of barn owl diet and designed to assess both feasibility and data reliability under routine classroom conditions. A total of 107 teachers from 74 middle and high schools received standardized pellet analysis kits following professional development workshops. Classroom implementation occurred without the researcher’s presence. Across both years, 1,612 students conducted pellet dissections, and 412 completed post-activity questionnaires assessing perceptions and engagement.

2.2. Teacher Recruitment and Training

Teachers were recruited through official workshops of the Israel Ministry of Education, which invited the study author to deliver pellet-analysis training as part of its professional development program. The relevant Ministry supervisors were therefore familiar with the workshop content, materials, and implementation procedures used in the project. All workshops were designed and delivered by the study author (MC), a university lecturer and researcher specializing in barn owl ecology and trophic dynamics. Participating teachers attended six-hour professional development workshops. Training included instruction on barn owl ecology, study objectives, standardized pellet dissection procedures, and prey identification using skeletal morphology and dentition keys.
Teachers independently dissected pellets during the workshops to ensure full familiarity with the protocol and the ability to implement the activity without researcher supervision. Because the workshops were conducted by an active researcher whose ecological monitoring relies on pellet analysis, teachers received direct training from a subject matter expert using the same identification framework employed in ongoing academic research. In addition to pellet identification procedures, the workshops introduced teachers to basic data compilation and analysis in Excel, including summarizing prey frequencies and generating simple descriptive statistics and graphs. Teachers were encouraged to have students conduct small classroom-level analyses of their pellet data following dissection. However, student data analysis performance was not formally measured or evaluated in the present study. Participation in the classroom phase was voluntary following training.

2.3. Research Kits and Materials

Each participating teacher received individually numbered research kits designed to standardize sampling and data recording. Kits contained 24 sterilized barn owl pellets packaged into two geographic subsets, a magnifying glass, written implementation protocols, 24 uniquely labeled specimen bags for skeletal items, prey identification keys, student questionnaires (both academic years), a teacher questionnaire (primarily 2019–2020), and prepaid return shipping materials. An Excel template for summarizing pellet data was included in the instructional materials.
Each pellet was assigned a unique identifier linking the student-recorded identifications to the returned skeletal items. Students were instructed to isolate and preserve all identifiable skeletal elements from each pellet and to record prey determinations on standardized data sheets. The owl pellets were sterilized in a hot-air oven at 160 °C (320 °F) for 2 hours to kill potential pathogens.
In addition to the identification keys, teachers received a detailed skeletal reference sheet illustrating three common rodent species, with all major cranial and mandibular elements labeled in Arabic, Hebrew, and English. The sheet was designed to standardize anatomical terminology across schools serving linguistically diverse student populations. Teachers were instructed to print and distribute the reference sheet during dissection sessions. Each kit also included printed identification guides focused on the most common rodent taxa expected in barn owl pellets from the study region, emphasizing diagnostic dental and cranial characteristics relevant for species-level identification.

2.4. Classroom Implementation

Prior to dissection, students received approximately 1 hour of structured instruction using standardized project materials, including presentation slides, instructional videos, and identification keys prepared by the study author. During dissection, pellets were hydrated and manually separated into fur and skeletal components. Students isolated skulls and mandibles and identified prey taxa primarily based on dental characteristics. All identifications were conducted within the classroom under teacher supervision. Researchers were not present during classroom implementation.
Following dissection, teachers were encouraged to guide students in compiling class-level prey data and conducting simple descriptive analyses in Excel; however, the extent and quality of this component were not systematically recorded and are therefore not included in the present analyses. After completing the activity, students completed questionnaires assessing disgust toward pellets, enjoyment of the activity (1 = suffered to 5 = enjoyed a lot), and willingness to repeat the activity (1 = no to 5 = definitely). Teachers completed a separate questionnaire regarding prior experience with classroom research and perceptions of the project.

2.5. Expert Verification and Identification Accuracy

All skeletal items and identification forms were returned to the research team. All returned items were examined and verified by the study author (MC), who routinely conducts professional pellet analyses as part of long-term barn owl dietary monitoring. Expert determinations were treated as the reference standard for prey identity. Student identifications were compared directly with expert determinations at the level of individual prey items within each pellet.
Prey identification performance was quantified at the student–pellet level using three complementary metrics: (1) perfect identification (binary; all prey items correctly identified), (2) proportional accuracy (number of correctly identified prey items divided by total prey items within the pellet), and (3) any success (binary; at least one prey item correctly identified). Pellet complexity was defined as the total number of prey items contained within each pellet and was treated as a continuous predictor of identification difficulty.
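The three pellet-level metrics can be sketched as follows. This is a minimal illustration of the scoring logic only; the function name, data structures, and example labels are ours, not part of the study's SPSS workflow, and the sketch assumes student records can be matched item-by-item to expert determinations.

```python
def pellet_metrics(expert_ids, student_ids):
    """Score one student-pellet attempt against expert determinations.

    expert_ids and student_ids are position-matched lists of prey labels,
    one entry per prey item in the pellet.
    """
    assert len(expert_ids) == len(student_ids) and expert_ids
    n_items = len(expert_ids)
    n_correct = sum(e == s for e, s in zip(expert_ids, student_ids))
    return {
        "perfect": n_correct == n_items,   # metric 1: all items correct (binary)
        "accuracy": n_correct / n_items,   # metric 2: proportional accuracy
        "any_success": n_correct > 0,      # metric 3: at least one item correct
    }

# A hypothetical two-item pellet in which one item is misidentified:
m = pellet_metrics(["Microtus guentheri", "Mus musculus"],
                   ["Microtus guentheri", "Meriones tristrami"])
# m["perfect"] is False, m["accuracy"] is 0.5, m["any_success"] is True
```

Under this scoring scheme, pellet complexity (the number of prey items) directly determines how many items must all be correct for the "perfect" metric to succeed.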

2.6. Response Variables and Covariates

Teacher participation was quantified as a binary outcome (returned at least one completed kit) and, among returning teachers, the proportion of completed kits relative to kits taken. Teacher background variables, available only for returning teachers who completed the teacher questionnaire, included prior experience conducting classroom-based scientific activities and prior collaboration with academic researchers (both binary). Student-level covariates included grade level (middle vs. high school), gender, school geographic region (central vs. peripheral), perceived disgust (binary), enjoyment (1–5 Likert scale), and willingness to repeat the activity (1–5 Likert scale).
Participation by teachers and students was voluntary, and student questionnaires were completed anonymously. No names or identifying personal information were collected from students. The project was conducted under the auspices of the Israel Ministry of Education and implemented through official Ministry workshops, with the knowledge and support of the Israeli Supervisor of Biology, Science, and Mathematics for High Schools (Dr. Irit Sadeh) and the Science and Technology Supervisor for Middle Schools in the Northern District (Mrs. Nirit Berger). Because the activity was implemented as an official Ministry of Education school-based project, no separate written parental consent procedure was applied.

2.7. Statistical Analyses

Local prey-identification difficulty (“pellet complexity”) was quantified as the total number of prey items in each pellet. For each student–pellet attempt, prey identification performance was summarized in three complementary ways: (1) overall accuracy, defined as the number of prey items correctly identified out of the total number of prey items in that pellet; (2) perfect identification, coded as whether all prey items in the pellet were identified correctly (0/1); and (3) any success, coded as whether the student identified at least one prey item correctly (0/1). Student perceptions were summarized using (i) whether pellets were perceived as disgusting (yes/no), (ii) enjoyment of the activity (1 = suffered to 5 = enjoyed a lot), and (iii) willingness to repeat the activity (1 = no to 5 = definitely). Sample sizes varied among models because analyses were restricted to records with complete responses for the relevant variables.
Teacher participation was analyzed in three steps. First, among all teachers who received kits, we modeled the probability of returning at least one completed kit using binary logistic regression with school level, school location, and year as fixed effects. Second, among teachers who returned at least one completed kit, we modeled kit completion rate (number of completed kits returned / total kits taken) using a binomial generalized linear model with a logit link, with school level, school location, and year as fixed effects. Third, among the subset of returning teachers who completed the teacher questionnaire, we modeled kit completion rate using a binomial generalized linear model with a logit link, including year, prior classroom scientific activity, and prior collaboration with academic researchers as fixed effects.
To evaluate predictors of student perceptions and prey identification success, we used generalized linear mixed models (GLMMs). Binary outcomes were analyzed with a binary logistic regression GLMM with a logit link. Overall accuracy was summarized descriptively as the proportion of prey items correctly identified within each pellet. Enjoyment and willingness ratings were analyzed using normal mixed effects models with an identity link (treating 1–5 ratings as approximately continuous). Fixed effects were selected a priori and included student sex, grade level (middle vs. high school), school location (central vs. peripheral), perceived disgust, enjoyment rating (where relevant), and pellet complexity (total prey items). In all mixed models, school name was included as a random intercept to address non-independence among student observations. Fixed effects in GLMMs were evaluated using F-tests as implemented in SPSS GENLINMIXED. Since each participating school was typically represented by a single science teacher responsible for implementing the protocol across classes, the school random effect likely captures both the school-level context and variation in how teachers carried out the protocol.
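In this specification, the perfect-identification model corresponds to the standard random-intercept logistic form (our notation, written out here for clarity; subscripts i index student–pellet attempts and j indexes schools):

```latex
\operatorname{logit}\,\Pr(\text{perfect}_{ij} = 1)
  = \beta_0 + \beta_1\,\text{complexity}_{ij}
  + \boldsymbol{\beta}_2^{\top}\mathbf{x}_{ij} + u_j,
\qquad u_j \sim \mathcal{N}\!\left(0,\ \sigma^{2}_{\text{school}}\right)
```

where x_ij collects the remaining fixed effects (sex, grade level, location, disgust) and u_j is the school-level random intercept.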
Collinearity among fixed effects was low (all VIF ≤ 1.47; tolerance ≥ 0.68; condition indices < 16). Residual diagnostics for linear mixed models indicated approximate normality and homoscedasticity. Model diagnostics revealed mild overdispersion (Pearson χ²/df = 1.58), which is within acceptable limits for binomial GLMMs and unlikely to affect inference. Accordingly, no additional correction was applied. All analyses were conducted in SPSS Statistics 29.0.2 (IBM Corp., Armonk, NY, USA).
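The overdispersion statistic reported above is, in principle, the Pearson chi-square divided by the residual degrees of freedom. The generic sketch below shows that computation for a Bernoulli (0/1) fit; the function name and the toy observations and fitted probabilities are invented for illustration and are not the study's data.

```python
def pearson_dispersion(y, mu, df_resid):
    """Pearson chi-square / residual df for a Bernoulli (0/1) fit.

    y:  observed 0/1 outcomes; mu: model-fitted probabilities.
    Under the binomial variance function, values near 1 indicate
    no overdispersion; values well above 1 suggest extra variance.
    """
    chi2 = sum((yi - mi) ** 2 / (mi * (1 - mi)) for yi, mi in zip(y, mu))
    return chi2 / df_resid

# Toy example: four observations with hypothetical fitted probabilities.
y = [1, 0, 1, 1]
mu = [0.8, 0.3, 0.6, 0.9]
ratio = pearson_dispersion(y, mu, df_resid=2)
```

A ratio such as the 1.58 reported here would typically be judged against the rough rule of thumb that values below about 2 are tolerable for binomial models.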

3. Results

3.1. Student Prey Identification Accuracy and Success

Across all pellets (N = 3,333 prey items), the prey assemblage was dominated by Levant vole (Microtus guentheri; 49.4%, n = 1,646) and house mouse (Mus musculus; 32.9%, n = 1,098), followed by shrews (Crocidura spp.; 8.6%, n = 288) and Tristram’s jird (Meriones tristrami; 6.8%, n = 225). Less frequent prey included black rat (Rattus rattus; 1.2%, n = 41), birds (0.5%, n = 16), mole rat (0.03%, n = 1), other prey (0.1%, n = 4), and unrecognizable items (0.4%, n = 14). Across 1,612 student–pellet attempts, 50.0% resulted in perfect identification (i.e., all prey items in the pellet were identified correctly). Across all prey items pooled across attempts, students correctly identified 62.0% of individual prey items. In 76% of student pellet attempts, students correctly identified at least one prey item. Using a binomial GLMM with a logit link and school name included as a random intercept, the probability of perfect identification decreased strongly as the number of prey items per pellet increased (β = −1.13, 95% CI −1.27 to −0.99; F1,1610 = 259.83, p < 0.001). Each additional prey item reduced the odds of perfect identification by approximately 68% (Exp(β) = 0.32, 95% CI 0.28–0.37) (Figure 1). Consistent with this effect, predicted probabilities (from the fitted model) decreased from approximately 85% for pellets containing a single prey item to ~50% for two prey items, ~30% for three prey items, and <10% for pellets containing five or more prey items, indicating a rapid loss of identification reliability as sample complexity increased. Pellets containing more prey items were, therefore, substantially more challenging for students to identify perfectly.
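The reported odds ratio follows directly from exponentiating the fixed-effect coefficient; the short check below reproduces the arithmetic using only the coefficients quoted above (a verification of reported values, not a reanalysis of the data).

```python
import math

beta = -1.13                      # fixed-effect slope for pellet complexity
or_per_item = math.exp(beta)      # odds ratio per additional prey item
pct_reduction = (1 - or_per_item) * 100

# Confidence-interval endpoints transform the same way as the point estimate:
ci_low, ci_high = math.exp(-1.27), math.exp(-0.99)

print(round(or_per_item, 2))                 # 0.32
print(round(pct_reduction))                  # 68
print(round(ci_low, 2), round(ci_high, 2))   # 0.28 0.37
```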

3.2. Teacher Participation Results

Of the 107 teachers who received kits, 71 (66.4%) returned at least one completed kit. Return rates were 74.5% (35/47) in 2018–2019 and 60.0% (36/60) in 2019–2020, but this difference was not statistically significant (χ²₁ = 2.47, p = 0.12). Among non-returning teachers (n=36), 22 returned kits unused and 14 did not return kits at all despite prepaid shipping. Among returners (n = 71), teachers took 1–5 kits (median = 1), and most completed all kits taken (55/71; 77.5%). The probability of returning at least one completed kit did not differ between school years (Exp(β) = 0.81, 95% CI 0.31–2.16; Wald χ²₁ = 0.17, p = 0.678). In a logistic regression including school level, school location, and year as fixed effects, the overall model was not significant (χ²₃ = 5.96, p = 0.114). School level showed a non-significant trend, with high-school teachers having lower odds of returning a kit compared to middle-school teachers (Exp(β) = 0.40, 95% CI 0.15–1.10; Wald χ²₁ = 3.13, p = 0.077). School location was not associated with return probability (Exp(β) = 1.04, 95% CI 0.43–2.52; p = 0.931).
Among teachers who returned at least one completed kit (returners; n = 71), the mean completion rate was 84.4% (98 completed kits returned out of 116 kits taken). Using a binomial generalized linear model with a logit link (response = number of completed kits returned / total kits taken), kit completion rate was not significantly associated with school location (β = −0.45, 95% CI −1.51 to 0.62; F1,67 = 0.70, p = 0.405), school level (β = 0.31, 95% CI −0.72 to 1.34; F1,67 = 0.36, p = 0.548), or year (β = 0.65, 95% CI −0.30 to 1.60; F1,67 = 1.87, p = 0.176). The overall fixed-effects model was not significant (F3,67 = 0.95, p = 0.424). Because this analysis excludes non-returning teachers, predictors of completion rate should be interpreted as applying only to teachers who participated at least once.
Most returners also completed a brief questionnaire (59/71; 83.1%). Of those responding, 50.0% reported conducting nature-based scientific activities with students prior to this project (29/58), 20.7% had previously conducted research activities with academic researchers (12/58), and 66.1% perceived a difference between academic research and active learning research (39/59). Teachers rated all proposed project goals highly (means 4.10–4.78 on a 1–5 scale), with the highest ratings for exploring nature/environment (mean = 4.78) and developing student curiosity (mean = 4.71).
Using a binomial generalized linear model with a logit link (response = number of completed kits returned / total kits taken), including year and teacher prior-experience variables as fixed effects, the overall model was not statistically significant (F3,53 = 2.08, p = 0.114). Completion rate did not differ by year (β = 0.53, 95% CI −0.57 to 1.62; F1,53 = 0.93, p = 0.339) or by prior classroom research collaboration with academic researchers (β = 0.40, 95% CI −0.65 to 1.46; F1,53 = 0.58, p = 0.449). However, teachers who had not previously conducted scientific activities in the classroom showed significantly lower completion rates compared to those with prior experience (β = −1.17, 95% CI −2.21 to −0.13; F1,53 = 5.12, p = 0.028), corresponding to approximately 69% lower odds of completing kits (Exp(β) = 0.31, 95% CI 0.11–0.88).

3.3. Student Perceptions and Engagement During the Pellet Activity

Overall, 74.9% of students (N = 394) reported finding the thought of pellets disgusting. Using a binary logistic regression GLMM with a logit link and school name included as a random intercept, students’ perception of pellets as disgusting was not significantly associated with grade level (β = −1.16, 95% CI −2.41 to 0.09, F1,367 = 3.33, p = 0.069), school location (β = 0.65, 95% CI −0.54 to 1.84, F1,367 = 1.15, p = 0.284), or gender (β = −0.10, 95% CI −0.61 to 0.41, F1,367 = 0.15, p = 0.697).
Students rated their enjoyment of the pellet activity relatively highly (mean = 3.88, SE = 0.06, N = 412; 1 = suffered, 5 = enjoyed a lot). Using a normal mixed effects model with an identity link and school name included as a random intercept, students who reported that pellets were disgusting had lower enjoyment scores than students who did not report disgust (β = −1.06, 95% CI −1.32 to −0.79, F1,362 = 62.72, p < 0.001; Figure 2). Enjoyment was not significantly related to grade level (β = 0.35, 95% CI −0.24 to 0.94, F1,362 = 1.36, p = 0.25), school location (β = −0.13, 95% CI −0.67 to 0.41, F1,362 = 0.22, p = 0.64), or gender (β = −0.18, 95% CI −0.41 to 0.06, F1,362 = 2.25, p = 0.14).
When asked whether they would want to analyze pellets again, students reported moderate to high willingness (mean = 3.56, SE = 0.07, N = 408; 1 = no, 5 = definitely). Using a normal mixed effects model with an identity link and school name included as a random intercept, willingness to repeat the activity differed strongly by enjoyment rating (F4,358 = 123.11, p < 0.001). Students who reported lower enjoyment (scores 1–4) were less willing to participate again than those who reported the highest enjoyment (score 5). For example, students who rated enjoyment as 1 had substantially lower willingness than those rating enjoyment as 5 (β = −3.33, 95% CI −3.70 to −2.96; p < 0.001; Figure 3a). Students who perceived pellets as disgusting were also less willing to repeat the activity than those who did not report disgust (β = −0.44, 95% CI −0.67 to −0.22, F1,358 = 15.56, p < 0.001; Figure 3b). In addition, boys reported slightly lower willingness than girls (β = −0.23, 95% CI −0.41 to −0.05, F1,358 = 6.34, p = 0.012; Figure 3c). Willingness was not related to grade level (β = 0.11, 95% CI −0.19 to 0.41, F1,358 = 0.53, p = 0.47) or school location (β = 0.03, 95% CI −0.25 to 0.30, F1,358 = 0.04, p = 0.85).

4. Discussion

4.1. Identification Accuracy and Limits to Ecological Inference

Student prey-identification accuracy was moderate. Across all prey items, 62% were correctly identified, and half of the student pellet attempts resulted in perfect identification. Most students correctly identified at least one prey item per pellet, indicating partial task mastery. However, ecological inference, especially regarding species-level diet composition, proportional abundance estimation, and rare-species detection, demands higher reliability than just partial success [11]. Similar patterns of training-dependent identification accuracy have been documented in other citizen science contexts, where volunteer performance improves with structured training but remains sensitive to task difficulty [41].
Even in a relatively simple prey community dominated by two rodent taxa, overall identification accuracy was not high enough to assume reliable species-level estimates without expert verification. At this level of error, misidentifications may distort proportional diet estimates, suppress or inflate apparent species richness, and obscure rare taxa [23,33]. While coarse grouping of dominant prey categories may remain broadly informative, fine-resolution ecological conclusions derived solely from student identifications would be vulnerable to bias. Whole-pellet identification reliability declined systematically with pellet complexity. This pattern shows that data quality in school-based citizen science depends not only on observer training but also on the interaction between human participants and the ecological complexity of what they interpret.

4.2. Implementation Feasibility in School-Based Monitoring

This study shows that school-based participation can generate substantial sampling effort for ecological monitoring in real classroom settings. About two-thirds of teachers who received kits returned at least one complete set of materials, and most teachers who returned kits completed all the kits they took. These participation rates are encouraging given that implementation occurred without researcher supervision and within existing curriculum constraints. However, one-third of teachers did not return usable materials, indicating that non-participation remains an important practical limitation when scaling school-based monitoring. Return probability did not differ significantly by year, school level, or school location in the multivariable model, although there was a non-significant tendency for high school teachers to be less likely than middle school teachers to return a completed kit. Among returning teachers, kit completion rate was also not significantly associated with year, school level, or school location. These results indicate that implementation was broadly feasible across school contexts, although teacher follow-through was variable [42,43].
Among the subset of returning teachers who completed the background questionnaire, prior collaboration with academic researchers was not associated with completion rate, whereas teachers without prior experience conducting scientific activities in the classroom had lower completion rates than those with such experience. This suggests that successful implementation may depend less on formal ties to academic research and more on teachers’ familiarity and confidence with classroom-based scientific activities [36,44,45]. For future school-based monitoring programs, support should focus not only on the scientific protocol but also on enhancing teachers’ practical ability to conduct hands-on inquiry in classrooms [36,44,45].
Despite initial disgust responses, enjoyment remained high, and willingness to repeat the activity was driven mainly by positive experiences. Citizen science projects can improve participants’ understanding of science and increase engagement when carefully designed to include authentic research activities [46,47]. Affective responses, therefore, did not stop large-scale participation. From a monitoring standpoint, these results show that schools can be effective partners in expanding sample-processing capacity, but participation dropout needs to be planned for when designing large-scale sampling networks.

4.3. Task Complexity as a Structural Constraint on Citizen-Science Data Quality

The probability of perfect identification declined sharply as pellet complexity increased, indicating that the reliability of whole-pellet identification is constrained by task complexity rather than by participation alone. Increased training and standardized protocols did not fully overcome this limitation, and similar patterns of persistent misclassification have been reported in other citizen science studies [41,48]. Consistent with our findings, pellets containing multiple prey items, which are common in productive agricultural systems [6], are more likely to yield incomplete or incorrect identifications when processed by non-specialists [48]. As prey richness increases, estimates of diet diversity based on student data may therefore become increasingly unreliable. Rare species are particularly vulnerable because even a few misclassifications can result in non-detection [49]. Monitoring programs relying solely on unverified student identifications may consequently underestimate diversity or misrepresent species’ relative abundance in more complex samples [50].
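The reported effect size can be translated into predicted probabilities under the fitted model form. The sketch below is illustrative only: the slope ln(0.32) corresponds to the reported ~68% reduction in the odds of perfect identification per additional prey item, but the intercept (beta0 = 3.0) is a hypothetical value chosen for demonstration, not a coefficient estimated in this study.

```python
import math

def expit(x: float) -> float:
    """Inverse logit link used by the binomial GLMM."""
    return 1.0 / (1.0 + math.exp(-x))

# Fixed-effect coefficients on the logit scale.
# beta1 = ln(0.32): each additional prey item multiplies the odds of a
# perfect identification by ~0.32 (a ~68% reduction), as reported.
# beta0 = 3.0 is an ILLUSTRATIVE intercept, not an estimate from the study.
beta0, beta1 = 3.0, math.log(0.32)

for k in range(1, 7):  # pellets containing 1..6 prey items
    p = expit(beta0 + beta1 * k)
    print(f"{k} prey item(s): P(all prey correctly identified) = {p:.2f}")
```

Under these assumptions the predicted probability declines monotonically with pellet complexity, and the ratio of odds between successive complexities is exactly 0.32, mirroring the qualitative pattern in Figure 1.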

4.4. Limitations and Future Directions

Several limitations should be acknowledged. First, identification accuracy was assessed in classroom settings without direct researcher supervision, and differences among teachers in instructional emphasis may have introduced variation in performance that our models did not fully capture [51]. Second, we measured overall accuracy but did not differentiate error types (e.g., systematic misclassification among specific taxa), which would help quantify ecological bias [52]. Third, observations from students in the same classroom may not have been fully independent, because peer discussion during identification could have influenced results. Finally, our conclusions are based on a prey community dominated by a small number of rodent taxa; identification performance might differ in regions with higher taxonomic diversity or more morphologically similar species, where volunteer-based identification has been shown to miss a substantial proportion of taxa [53].
One potential avenue for improving reliability while maintaining broad participation is the development of an AI-based smartphone application capable of automated species-level identification of skeletal items. Automated species recognition systems are already widely used in ecological research, including deep-learning approaches for camera-trap image classification and plant identification [54,55,56], demonstrating the feasibility of computer-vision applications in biodiversity monitoring. However, any such system would require training on large expert-labeled datasets and independent validation against expert evaluations before it could be used reliably in school-based monitoring [57]. Combined with structured validation frameworks and school-based participation, such an approach could both expand monitoring capacity and offer meaningful scientific engagement for students [34,58]. Since owl pellets are already distributed commercially to schools for anatomical study [59], integrating verified geospatial metadata into these supply chains could turn an existing educational product into a distributed ecological monitoring resource. Georeferenced pellet sourcing would facilitate repeated sampling across regions and involve students in authentic data collection.
Future research should focus on quantifying taxon-specific misclassification patterns [41], assessing the performance of structured digital identification tools [56], and testing tiered validation frameworks that combine wide educational participation with targeted expert oversight [33]. Such efforts would help determine the conditions under which school-based pellet analysis can reliably contribute to long-term ecological monitoring.
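One way to make the idea of tiered validation with targeted expert oversight concrete is a simple triage rule that accepts student identifications for simple pellets and routes complex or incomplete records to experts. The sketch below is purely illustrative: the `PelletRecord` fields and the complexity threshold of 3 prey items are assumptions for demonstration, not parameters estimated or used in this study.

```python
from dataclasses import dataclass

@dataclass
class PelletRecord:
    pellet_id: str
    prey_items: int           # number of prey items the student reported
    student_identified: bool  # whether the student supplied species labels

# ASSUMED threshold, motivated qualitatively by the observed drop in
# whole-pellet accuracy with complexity; not a value from this study.
COMPLEXITY_THRESHOLD = 3

def triage(record: PelletRecord) -> str:
    """Assign a student-analyzed pellet to a validation tier."""
    if not record.student_identified:
        return "expert"   # no usable student data: full expert analysis
    if record.prey_items >= COMPLEXITY_THRESHOLD:
        return "expert"   # complex pellet: expert verification
    return "accept"       # simple pellet: accept, with random spot checks

records = [PelletRecord("p1", 1, True),
           PelletRecord("p2", 4, True),
           PelletRecord("p3", 2, False)]
print([triage(r) for r in records])  # → ['accept', 'expert', 'expert']
```

In a real program the threshold and spot-check rate would be tuned against expert-verified subsamples, trading expert workload against the residual error rate in accepted records.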

5. Conclusions

School-based pellet analysis can generate substantial sampling effort and broad geographic coverage, but identification reliability declines as ecological complexity increases. This study shows that the reliability of student-generated ecological data depends on the complexity of the natural material being interpreted. Even with standardized training and expert-designed protocols, accurate species-level identification becomes harder as task difficulty increases. Participation alone does not ensure reliable ecological inference. School-based citizen science programs should therefore incorporate validation frameworks, particularly when sample complexity is high, to ensure data quality for biodiversity monitoring.

Funding

This research was funded by United States Embassy in Jerusalem, grant number SIS70020GR0133.

Institutional Review Board Statement

Ethical review and approval were waived for this study because it was implemented as an official school-based educational activity through Israel Ministry of Education workshops; participation was voluntary; student questionnaires were anonymous; no names or other identifying personal information was collected; and the study involved no medical, clinical, or invasive procedures.

Data Availability Statement

The data presented in this study are not publicly available due to privacy and ethical restrictions associated with anonymous questionnaire data collected from students in a school-based setting.

Acknowledgments

We thank Dana Klen for her help with the project. We are grateful to the Israeli Supervisor of Biology, Science, and Mathematics for High Schools (Dr. Irit Sadeh) and to the Science and Technology Supervisor for Middle Schools in the Northern District (Mrs. Nirit Berger), both from the Israel Ministry of Education, for providing access to teacher workshops and supporting the implementation of the activity. Special thanks to Ellen Schnitzer from the U.S. Embassy for her assistance and support throughout the project.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Anderson, C.B. Biodiversity Monitoring, Earth Observations and the Ecology of Scale. Ecol. Lett. 2018, 21, 1572–1585. [Google Scholar] [CrossRef]
  2. Jetz, W.; McGeoch, M.A.; Guralnick, R.; Ferrier, S.; Beck, J.; Costello, M.J.; Fernandez, M.; Geller, G.N.; Keil, P.; Merow, C.; et al. Essential Biodiversity Variables for Mapping and Monitoring Species Populations. Nat. Ecol. Evol. 2019, 3, 539–551. [Google Scholar] [CrossRef]
  3. Chandler, M.; See, L.; Copas, K.; Bonde, A.M.Z.; López, B.C.; Danielsen, F.; Legind, J.K.; Masinde, S.; Miller-Rushing, A.J.; Newman, G.; et al. Contribution of Citizen Science towards International Biodiversity Monitoring. Biol. Conserv. 2017, 213, 280–294. [Google Scholar] [CrossRef]
  4. Di Cecco, G.J.; Barve, V.; Belitz, M.W.; Stucky, B.J.; Guralnick, R.P.; Hurlbert, A.H. Observing the Observers: How Participants Contribute Data to INaturalist and Implications for Biodiversity Science. Bioscience 2021, 71, 1179–1188. [Google Scholar] [CrossRef]
  5. Bowler, D.E.; Bhandari, N.; Repke, L.; Beuthner, C.; Callaghan, C.T.; Eichenberg, D.; Henle, K.; Klenke, R.; Richter, A.; Jansen, F.; et al. Decision-Making of Citizen Scientists When Recording Species Observations. Sci. Rep. 2022, 12, 11069. [Google Scholar] [CrossRef] [PubMed]
  6. Balestrieri, A.; Gazzola, A.; Formenton, G.; Canova, L. Long-Term Impact of Agricultural Practices on the Diversity of Small Mammal Communities: A Case Study Based on Owl Pellets. Environ. Monit. Assess. 2019, 191, 725. [Google Scholar] [CrossRef]
  7. Williams, S.T.; Maree, N.; Taylor, P.; Belmain, S.R.; Keith, M.; Swanepoel, L.H. Predation by Small Mammalian Carnivores in Rural Agro-Ecosystems: An Undervalued Ecosystem Service? Ecosyst. Serv. 2018, 30, 362–371. [Google Scholar] [CrossRef]
  8. Hurst, Z.M.; McCleery, R.A.; Collier, B.A.; Silvy, N.J.; Taylor, P.J.; Monadjem, A. Linking Changes in Small Mammal Communities to Ecosystem Functions in an Agricultural Landscape. Mamm. Biol. 2014, 79, 17–23. [Google Scholar] [CrossRef]
  9. Labuschagne, L.; Swanepoel, L.H.; Taylor, P.J.; Belmain, S.R.; Keith, M. Are Avian Predators Effective Biological Control Agents for Rodent Pest Management in Agricultural Systems? Biol. Control 2016, 101, 94–102. [Google Scholar] [CrossRef]
  10. Cleary, K.A.; Bonaiuto, V.; Amulike, B.; Pearson, J.; Johnson, G. A Reduced Labor, Non-Invasive Method for Characterizing Small Mammal Communities. Mammal Res. 2025, 70, 151–158. [Google Scholar] [CrossRef]
  11. Heisler, L.M.; Somers, C.M.; Poulin, R.G. Owl Pellets: A More Effective Alternative to Conventional Trapping for Broad-Scale Studies of Small Mammal Communities. Methods Ecol. Evol. 2016, 7, 96–103. [Google Scholar] [CrossRef]
  12. Torre, I.; Arrizabalaga, A.; Flaquer, C. Three Methods for Assessing Richness and Composition of Small Mammal Communities. J. Mammal. 2004, 85, 524–530. [Google Scholar] [CrossRef]
  13. Taylor, I. Barn Owls: Predator-Prey Relationships and Conservation; Cambridge University Press, 1994. [Google Scholar]
  14. Raczyński, J.; Ruprecht, A.L. Effect of Digestion on the Osteological Composition of Owl Pellets. Acta Ornithol. 1974, 14, 25–38. [Google Scholar]
  15. Johnston, A.; Fink, D.; Hochachka, W.M.; Kelling, S. Estimates of Observer Expertise Improve Species Distributions from Citizen Science Data. Methods Ecol. Evol. 2018, 9, 88–97. [Google Scholar] [CrossRef]
  16. Gaston, K.J.; Soga, M.; Duffy, J.P.; Garrett, J.K.; Gaston, S.; Cox, D.T.C. Personalised Ecology. Trends Ecol. Evol. 2018, 33, 916–925. [Google Scholar] [CrossRef]
  17. Peleg, O.; Nir, S.; Meyrom, K.; Aviel, S.; Roulin, A.; Izhaki, I.; Leshem, Y.; Charter, M. Three Decades of Satisfied Israeli Farmers: Barn Owls (Tyto alba) as Biological Pest Control of Rodents. In Proceedings of the 28th Vertebrate Pest Conference; Woods, D.M., Ed.; University of California: Davis, CA, USA, 2018; pp. 194–203. [Google Scholar]
  18. Ronen, N.; Brook, A.; Charter, M. Assessing Birds of Prey as Biological Pest Control: A Comparative Study with Hunting Perches and Rodenticides on Rodent Activity and Crop Health. Biology 2025, 14, 1108. [Google Scholar] [CrossRef]
  19. Charter, M.; Leshem, Y.; Meyrom, K.; Peleg, O.; Roulin, A. The Importance of Micro-Habitat in the Breeding of Barn Owls Tyto Alba. Bird Study 2012, 59, 368–371. [Google Scholar] [CrossRef]
  20. Charter, M.; Izhaki, I.; Leshem, Y.; Meyrom, K.; Roulin, A. Relationship between Diet and Reproductive Success in the Israeli Barn Owl. J. Arid Environ. 2015, 122, 59–63. [Google Scholar] [CrossRef]
  21. Charter, M.; Izhaki, I.; Roulin, A. The Relationship between Intra–Guild Diet Overlap and Breeding in Owls in Israel. Popul. Ecol. 2018, 60, 397–403. [Google Scholar] [CrossRef]
  22. Bonney, R.; Cooper, C.B.; Dickinson, J.; Kelling, S.; Phillips, T.; Rosenberg, K. V.; Shirk, J. Citizen Science: A Developing Tool for Expanding Science Knowledge and Scientific Literacy. Bioscience 2009, 59, 977–984. [Google Scholar] [CrossRef]
  23. Aceves-Bueno, E.; Adeleye, A.S.; Feraud, M.; Huang, Y.; Tao, M.; Yang, Y.; Anderson, S.E. The Accuracy of Citizen Science Data: A Quantitative Review. Bull. Ecol. Soc. Am. 2017, 98, 278–290. [Google Scholar] [CrossRef]
  24. Schmidt, B.R.; Cruickshank, S.S.; Bühler, C.; Bergamini, A. Observers Are a Key Source of Detection Heterogeneity and Biased Occupancy Estimates in Species Monitoring. Biol. Conserv. 2023, 283, 110102. [Google Scholar] [CrossRef]
  25. Farmer, R.G.; Leonard, M.L.; Horn, A.G. Observer Effects and Avian-Call-Count Survey Quality: Rare-Species Biases and Overconfidence. Auk 2012, 129, 76–86. [Google Scholar] [CrossRef]
  26. Lüsse, M.; Brockhage, F.; Beeken, M.; Pietzner, V. Citizen Science and Its Potential for Science Education. Int. J. Sci. Educ. 2022, 44, 1120–1142. [Google Scholar] [CrossRef]
  27. Kelemen-Finan, J.; Scheuch, M.; Winter, S. Contributions from Citizen Science to Science Education: An Examination of a Biodiversity Citizen Science Project with Schools in Central Europe. Int. J. Sci. Educ. 2018, 40, 2078–2098. [Google Scholar] [CrossRef]
  28. Shah, H.R.; Martinez, L.R. Current Approaches in Implementing Citizen Science in the Classroom. J. Microbiol. Biol. Educ. 2016, 17, 17–22. [Google Scholar] [CrossRef]
  29. Finger, L.; van den Bogaert, V.; Schmidt, L.; Fleischer, J.; Stadtler, M.; Sommer, K.; Wirth, J. The Science of Citizen Science: A Systematic Literature Review on Educational and Scientific Outcomes. Front. Educ. 2023, 8, 1226529. [Google Scholar] [CrossRef]
  30. Ostrom, E. A General Framework for Analyzing Sustainability of Social-Ecological Systems. Science 2009, 325, 419–422. [Google Scholar] [CrossRef]
  31. Brown, E.; Liu, H.-L. Citizen Science in the Classroom: Data Quality and Student Engagement. J. Community Engagem. Scholarsh. 2024, 16, 8. [Google Scholar] [CrossRef]
  32. Philippoff, J.; Baumgartner, E. Addressing Common Student Technical Errors in Field Data Collection: An Analysis of a Citizen-Science Monitoring Project. J. Microbiol. Biol. Educ. 2016, 17, 51–55. [Google Scholar] [CrossRef]
  33. Kosmala, M.; Wiggins, A.; Swanson, A.; Simmons, B. Assessing Data Quality in Citizen Science. Front. Ecol. Environ. 2016, 14, 551–560. [Google Scholar] [CrossRef]
  34. Shirk, J.L.; Ballard, H.L.; Wilderman, C.C.; Phillips, T.; Wiggins, A.; Jordan, R.; McCallie, E.; Minarchek, M.; Lewenstein, B. V.; Krasny, M.E.; et al. Public Participation in Scientific Research: A Framework for Deliberate Design. Ecol. Soc. 2012, 17. [Google Scholar] [CrossRef]
  35. Meza-Torres, C.; Jordan, M.; Zuiker, S.; Jongewaard, R.; Adeloju, E.; Spreitzer, K. Examining Teacher Supports for Visibility, Believability, and Meaningfulness in Place-Based Citizen Science. Proc. Int. Conf. Learn. Sci. ICLS 2024, 2409–2410. [Google Scholar] [CrossRef]
  36. Aristeidou, M.; Lorke, J.; Ismail, N. Citizen Science: Schoolteachers’ Motivation, Experiences, and Recommendations. Int. J. Sci. Math. Educ. 2022, 21, 2067–2093. [Google Scholar] [CrossRef]
  37. Carrier, S.J.; Scharen, D.R.; Hayes, M.; Smith, P.S.; Bruce, A.; Craven, L. Citizen Science in Elementary Classrooms: A Tale of Two Teachers. Front. Educ. 2024, 9, 1470070. [Google Scholar] [CrossRef]
  38. Chambert, T.; Miller, D.A.W.; Nichols, J.D. Modeling False Positive Detections in Species Occurrence Data under Different Study Designs. Ecology 2015, 96, 332–339. [Google Scholar] [CrossRef] [PubMed]
  39. Dickinson, J.L.; Zuckerberg, B.; Bonter, D.N. Citizen Science as an Ecological Research Tool: Challenges and Benefits. Annu. Rev. Ecol. Evol. Syst. 2010, 41, 149–172. [Google Scholar] [CrossRef]
  40. Dorazio, R.M.; Royle, J.A. Estimating Size and Composition of Biological Communities by Modeling the Occurrence of Species. J. Am. Stat. Assoc. 2005, 100, 389–398. [Google Scholar] [CrossRef]
  41. Ratnieks, F.L.W.; Schrell, F.; Sheppard, R.C.; Brown, E.; Bristow, O.E.; Garbuzov, M. Data Reliability in Citizen Science: Learning Curve and the Effects of Training Method, Volunteer Background and Experience on Identification Accuracy of Insects Visiting Ivy Flowers. Methods Ecol. Evol. 2016, 7, 1226–1235. [Google Scholar] [CrossRef]
  42. National Academies of Sciences, Engineering, and Medicine; Division of Behavioral and Social Sciences and Education; Board on Science Education; Committee on Designing Citizen Science to Support Science Learning. Processes of Learning and Learning in Science. In Learning Through Citizen Science: Enhancing Opportunities by Design; Pandya, R., Dibner, K.A., Eds.; National Academies Press: Washington, DC, 2018; ISBN 9780309479165. [Google Scholar]
  43. Trautmann, N.M.; Shirk, J.L.; Krasny, M.E. Who Poses the Question? Using Citizen Science to Help K–12 Teachers Meet the Mandate for Inquiry. In Citizen Science; Louv, R., Fitzpatrick, J.W., Eds.; Cornell University Press, 2017; pp. 179–190. [Google Scholar]
  44. Bopardikar, A.; Bernstein, D.; McKenney, S. Boundary Crossing in Student-Teacher-Scientist-Partnerships: Designer Considerations and Methods to Integrate Citizen Science with School Science. Instr. Sci. 2023, 51, 847–886. [Google Scholar] [CrossRef]
  45. Braz Sousa, L.; Kenneally, C.; Golumbic, Y.; Martin, J.M.; Preston, C.; Rutledge, P.; Motion, A. Teacher Experiences and Understanding of Citizen Science in Australian Classrooms. PLoS One 2024, 19, e0312680. [Google Scholar] [CrossRef]
  46. Bonney, R.; Phillips, T.B.; Ballard, H.L.; Enck, J.W. Can Citizen Science Enhance Public Understanding of Science? Public Underst. Sci. 2016, 25, 2–16. [Google Scholar] [CrossRef]
  47. Phillips, T.; Porticella, N.; Constas, M.; Bonney, R. A Framework for Articulating and Measuring Individual Learning Outcomes from Participation in Citizen Science. Citiz. Sci. Theory Pract. 2018, 3, 3. [Google Scholar] [CrossRef]
  48. Lovell, S.; Hamer, M.; Slotow, R.; Herbert, D. An Assessment of the Use of Volunteers for Terrestrial Invertebrate Biodiversity Surveys. Biodivers. Conserv. 2009, 18, 3295–3307. [Google Scholar] [CrossRef]
  49. Isaac, N.J.B.; van Strien, A.J.; August, T.A.; de Zeeuw, M.P.; Roy, D.B. Statistics for Citizen Science: Extracting Signals of Change from Noisy Ecological Data. Methods Ecol. Evol. 2014, 5, 1052–1060. [Google Scholar] [CrossRef]
  50. Lukyanenko, R.; Parsons, J.; Wiersma, Y.F. Emerging Problems of Data Quality in Citizen Science. Conserv. Biol. 2016, 30, 447–449. [Google Scholar] [CrossRef] [PubMed]
  51. Crall, A.W.; Newman, G.J.; Stohlgren, T.J.; Holfelder, K.A.; Graham, J.; Waller, D.M. Assessing Citizen Science Data Quality: An Invasive Species Case Study. Conserv. Lett. 2011, 4, 433–442. [Google Scholar] [CrossRef]
  52. Bird, T.J.; Bates, A.E.; Lefcheck, J.S.; Hill, N.A.; Thomson, R.J.; Edgar, G.J.; Stuart-Smith, R.D.; Wotherspoon, S.; Krkosek, M.; Stuart-Smith, J.F.; et al. Statistical Solutions for Error and Bias in Global Citizen Science Datasets. Biol. Conserv. 2014, 173, 144–154. [Google Scholar] [CrossRef]
  53. Kremen, C.; Ullman, K.S.; Thorp, R.W. Evaluating the Quality of Citizen-Scientist Data on Pollinator Communities. Conserv. Biol. 2011, 25, 607–617. [Google Scholar] [CrossRef]
  54. Norouzzadeh, M.S.; Nguyen, A.; Kosmala, M.; Swanson, A.; Palmer, M.S.; Packer, C.; Clune, J. Automatically Identifying, Counting, and Describing Wild Animals in Camera-Trap Images with Deep Learning. Proc. Natl. Acad. Sci. U. S. A. 2018, 115, E5716–E5725. [Google Scholar] [CrossRef]
  55. Christin, S.; Hervet, É.; Lecomte, N. Applications for Deep Learning in Ecology. Methods Ecol. Evol. 2019, 10, 1632–1644. [Google Scholar] [CrossRef]
  56. Willi, M.; Pitman, R.T.; Cardoso, A.W.; Locke, C.; Swanson, A.; Boyer, A.; Veldthuis, M.; Fortson, L. Identifying Animal Species in Camera Trap Images Using Deep Learning and Citizen Science. Methods Ecol. Evol. 2019, 10, 80–91. [Google Scholar] [CrossRef]
  57. Tabak, M.A.; Norouzzadeh, M.S.; Wolfson, D.W.; Sweeney, S.J.; Vercauteren, K.C.; Snow, N.P.; Halseth, J.M.; Di Salvo, P.A.; Lewis, J.S.; White, M.D.; et al. Machine Learning to Classify Animal Species in Camera Trap Images: Applications in Ecology. Methods Ecol. Evol. 2019, 10, 585–590. [Google Scholar] [CrossRef]
  58. Zoellick, B.; Nelson, S.J.; Schauffler, M. Participatory Science and Education: Bringing Both Views into Focus. Front. Ecol. Environ. 2012, 10, 310–313. [Google Scholar] [CrossRef]
  59. Brustenga, L.; Massetti, S.; Paletta, C.; Piccioni, E.; Di Seclì, G.; La Porta, G.; Lucentini, L. Shaping Young Naturalists, Owl Pellets Dissection to Train High-School Students in Comparative Anatomy and Molecular Biology. J. Biol. Educ. 2025, 59, 731–744. [Google Scholar] [CrossRef]
Figure 1. Probability of perfect prey identification (all prey items correctly identified) as a function of pellet complexity (total prey items per pellet). Points represent observed proportions (±95% CI). The solid line shows predictions from a binomial GLMM with a logit link and school as a random intercept. The “6+” category pools pellets containing ≥6 prey items due to low sample sizes (n = 628, 500, 224, 102, 29, and 6 for prey items 1–5 and 6+, respectively).
Figure 2. Enjoyment (1 = suffered, 5 = enjoyed a lot) of owl pellet research activity by perceived disgust (yes vs no).
Figure 3. Willingness (1 = no, 5 = definitely) to analyze owl pellets again as a function of (a) enjoyment (1 = suffered, 5 = enjoyed a lot) of the pellet activity, (b) perceived disgust, and (c) gender.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.