Submitted:
03 February 2026
Posted:
03 February 2026
Abstract
Keywords:
1. Introduction
2. Background
2.1. Methodologies for Describing Complex Software Systems
2.2. Methods for Eliciting Requirements
2.3. Teaching Software Maintenance
- Apply software maintenance fundamentals, including terminology; the nature of and need for maintenance; maintenance costs; evolution and categories of maintenance.
- Incorporate key issues in software maintenance, to include technical issues; management issues; cost estimation; and software maintenance measurement.
- Utilize the best-practice maintenance process.
- Exercise best-practice techniques for maintenance.
3. The Case Study of AMMBER



4. The Evolution of AMMBER
4.1. Issues That Influenced the State of the Software
- It is characteristic of student projects that they are not thoroughly tested. The timing of assessment militates against testing, and there is no opportunity to test features exhaustively. The software testing units that students take during their degrees tend to be theoretical, which also militates against a strong focus on testing. In our experience, there is generally no testing culture among software engineering students. For AMMBER, the net effect was that features had not been thoroughly tested, leaving many bugs to be fixed and features to be fine-tuned.
- The code base was disorganised. Because AMMBER was developed and maintained by multiple developers, coding standards were not applied consistently. Code quality was uneven, with some parts better structured than others. Coding styles also varied across files; for example, some files used four-space indentation while others used two-space indentation. Previous developers had left commented-out code, which made it hard for later developers to determine whether that code should be retained or removed. Furthermore, some variables had unclear or poorly chosen names, making their purpose and usage difficult to understand.
- The repository was messy, with too many branches. Each student project started by forking a new branch. Because the changes were not properly integrated, it was unclear which branches should be eliminated. The repository has still not been adequately cleaned up.
- The documentation was out of date. An intern built a manual in 2019, which was potentially a useful resource. However, none of the enhancements touched the manual because it was not clear what would be integrated; as a result, new features were inadequately documented. Ideally, once a feature is finished, tested and integrated, the manual should be updated to describe it. The manual was not actually updated until 2025. How best to describe the behaviour of the features remains an open question. The unclear status of the documentation was symptomatic of not having a maintenance culture.
- It was also unclear what diagrams would be helpful. The most notable example was the architecture diagram. The original team that built the precursor did not provide a system architecture diagram, so one of the interns was asked to produce one in 2020. What he drew is depicted in Figure 4.

  While the diagram was correct, it was not helpful for someone trying to understand the system. It was not sufficiently clear, and it contained many puzzling details, such as the colours chosen for the modules. A reader already needed to understand the system to determine that the diagram was correct. The intern who drew the diagram had no experience in describing a system so that others could gain an understanding of it.

  Ironically, Figure 4 has been useful as an example of a poor diagram. The figure has been shown to hundreds of software engineering students. No student has been able to understand the diagram without a lot of background. In general, diagrams are only useful if they help with understanding the system. They should not be drawn for the sake of it. Unfortunately, most students only draw diagrams because they are demanded in assignments, rather than because they will be a useful communication device. It should go without saying that part of maintenance is assembling useful diagrams.
- The architecture diagram might have been more useful had the intern been better directed. The project was driven by the first author, both as the primary user of the software and as the initiator of student projects. He lacked experience of maintaining software that would be used continually over a period of years. The maintenance was managed by the last author, who had extensive experience of maintaining commercial software, but not of how software maintenance could be augmented by the efforts of university students. This lack of experience of how best to work with students meant that there were multiple versions of the code containing some good additions that were never integrated into the deployed software. The issue was exacerbated by the COVID restrictions of the period, which limited the number of face-to-face meetings.
- Many of the software requirements changed over time. An early version of the editor could upload an image, usually a copy of the do, be, feel, and who lists from a whiteboard. The lists could then be copied more easily into the editor. Storing images caused some problems with the backend database. Being able to refer to an image became less useful as we became more experienced in integrating the tool with running a do/be/feel session. Image upload is not part of AMMBER.

  The biggest change of requirement was where the models were stored: from the cloud to local storage on an individual laptop. The software was originally designed to allow groups to work on a model. That necessitated controlling access to the software and only allowing individuals or teams to see the models they had created. Models would be stored in the cloud, and members of a project team would access the cloud and modify the model. Controlling access was unwieldy. It was necessary to build and maintain a login system and to keep track of who had access to which model. It was thought that clients and students would tinker with models as appropriate. That rarely happened, because access was not easy and because people did not often check back on their models.

  We had multiple instances of the software running: one instance for research use, and a separate instance for each subject in which students manipulated models. Storing files with respect to relative paths was complicated and tricky when porting from one instance to another.

  One suggested improvement to AMMBER was to allow group communication about models. At one stage, requirements were gathered in a student project extension to allow a client to comment on a model, and potentially approve it. Allowing access to different people with different roles proved unwieldy. The feature was neither fully implemented nor integrated into AMMBER.

  In retrospect, it has been easier to avoid the issue. Problems with model paths are almost inevitable. It was easier to export a model, which could be sent via email or Slack, and have another person upload it. This is how AMMBER now works.

  Furthermore, hosting on the cloud became expensive. While some maintenance money was received from some projects, there was little externally supported activity. At one stage there were three or four different versions sitting on different servers, creating a separate maintenance problem. We also had an issue with another student project, where an upgrade by a cloud provider stopped the deployed version from running. The client did not have the expertise to fix the problem.

  The motivational model editor was envisaged as a web resource where teams would work together. For the reasons noted above, it was decided that models should be stored locally and reloaded when they are to be modified. Saving and exporting files had not been implemented properly in previous extensions. Storing files locally overcame a big problem, namely that clients could not easily tinker with models. In retrospect, local versions are better.
- The initial version of the editor used MySQL for the backend database. A security issue emerged in MySQL that necessitated an upgrade, which was not easy to merge into the software because we lacked a good pipeline. We switched to Postgres, which was better maintained. Such issues are inevitable.
- The final issue is software deprecation. Systems are usually built using libraries and components, which inevitably get upgraded. The updated versions may no longer work with older code. A decision has to be made whether to accept an upgrade or to stay with an older version. That is an issue that students have no experience in navigating.

  The biggest decision of that kind for AMMBER involved the use of mxGraph for drawing shapes. mxGraph is no longer supported. Unfortunately, there was no obvious supported replacement. After searching, we decided to stick with mxGraph, as many people were still using its components, which worked well enough and could be extended.
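The export-and-upload workflow mentioned in the list above (a model is exported, sent by email or Slack, and loaded by another person) might look roughly like the following. This is a minimal sketch under stated assumptions, not AMMBER's actual code: the `MotivationalModel` shape, the field names, the `formatVersion` field and both function names are hypothetical, since the text does not describe the real file format.

```typescript
// Hypothetical file format: AMMBER's real export format is not described in the text.
const FORMAT_VERSION = 1;

// Assumed model shape, based only on the who/do/be/feel lists mentioned in the text.
interface MotivationalModel {
  name: string;
  who: string[];
  doGoals: string[];
  beGoals: string[];
  feelGoals: string[];
}

interface ModelFile {
  formatVersion: number;
  model: MotivationalModel;
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Serialise a model to text that can be saved locally or sent to a colleague.
function exportModel(model: MotivationalModel): string {
  const file: ModelFile = { formatVersion: FORMAT_VERSION, model };
  return JSON.stringify(file, null, 2);
}

// Parse and validate text produced by exportModel; throws on anything unexpected.
function importModel(text: string): MotivationalModel {
  const parsed: unknown = JSON.parse(text);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Not a model file");
  }
  const file = parsed as Record<string, unknown>;
  if (file["formatVersion"] !== FORMAT_VERSION) {
    throw new Error(`Unsupported format version: ${String(file["formatVersion"])}`);
  }
  const raw = file["model"];
  if (typeof raw !== "object" || raw === null) {
    throw new Error("Model file has no model");
  }
  const fields = raw as Record<string, unknown>;
  const name = fields["name"];
  const who = fields["who"];
  const doGoals = fields["doGoals"];
  const beGoals = fields["beGoals"];
  const feelGoals = fields["feelGoals"];
  if (
    typeof name !== "string" ||
    !isStringArray(who) ||
    !isStringArray(doGoals) ||
    !isStringArray(beGoals) ||
    !isStringArray(feelGoals)
  ) {
    throw new Error("Model file is missing required fields");
  }
  return { name, who, doGoals, beGoals, feelGoals };
}
```

Because the file is self-contained text with an explicit version number, it carries no server-side paths or login state, which is the property that made local storage easier to maintain than the cloud-hosted instances.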
4.2. Why AMMBER Maintenance Improved
5. Refactoring The Codebase
5.1. Refactoring Motivational Modeller
- There were two versions of the data structures used in the app, in effect two views of the same dataset. The two were maintained separately, so each modification had to update both views. Typically, both data structures were updated inline in the code, which made testing those updates effectively impossible. Bugs regularly arose from the two data structures getting out of sync.
- The use of React context (useContext) was naive: it exposed both sets of data structures to any code, which could then modify them without constraint.
- The use of components was inconsistent. While some components had good structure and organisation, others were ad hoc. In some places the code that updated the data structures was inline, which made it difficult to understand and modify, let alone test.
- Concepts were insufficiently encapsulated. In one of the models there is an isRowEmpty test which, as it happens, tests whether the name of the goal is empty. Instead of using that test consistently, the app would trim and test the string inline, losing the clarity and modifiability of a named function.
- Deeply chained functions made refactoring difficult because it was uncertain what other functions used the functionality, e.g., onKeyDown → handleKeyPress → handleAddRow.
- Overuse of useEffect disconnected actions from their causes.
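The encapsulation point in the list above can be illustrated with a short sketch. The `GoalRow` type and the `nonEmptyRows` helper are hypothetical, introduced only for illustration; only the name isRowEmpty and its meaning (the goal name is empty) come from the text.

```typescript
// Hypothetical row shape: the real AMMBER model types are not shown in the text.
interface GoalRow {
  readonly id: number;
  readonly name: string;
}

// Named predicate: a row counts as empty when its goal name is blank.
function isRowEmpty(row: GoalRow): boolean {
  return row.name.trim().length === 0;
}

// Callers use the predicate instead of repeating `row.name.trim() === ""` inline,
// so a later change to what "empty" means is made in exactly one place.
function nonEmptyRows(rows: readonly GoalRow[]): GoalRow[] {
  return rows.filter((row) => !isRowEmpty(row));
}
```

If the definition of an empty row later changes, for example to ignore placeholder text, only isRowEmpty needs to be edited and retested.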
5.1.1. Converting to a Single Data Structure
- The use of a React context to share data among components without moderating access meant that wherever data updates were made, both views of the data needed to be updated. This made the code complex and difficult to maintain because the updating logic was scattered through the codebase. It was an obvious opportunity to refactor using a React reducer (useReducer, effectively an event-driven state machine) to which specific events are dispatched, e.g., updateTextForGoalId, addGoalToTree, etc. This gave the updates to the data structure clear names and moved the code that mutated the data structures into a single place. With the code consolidated, it became possible to write unit tests for the mutations and have confidence in their correctness. Another advantage of this approach was that the data structures exposed to the code became immutable, so that the only way to update them was to dispatch events to the state machine. Support for creating a reducer with these characteristics was available in Redux Toolkit (RTK), the official toolset for the Redux state-management library, which is commonly used with React. While we have not used the rest of the functionality that RTK provides, this has proved to be a good choice. Only a two-line TypeScript shim was needed to carry the types across correctly.
- When refactoring the code to use higher-level operations, it was difficult to understand the existing code fragments and what they did, because they all worked on the data in different ways.
- The React context was split into two parts: a context that shared the data values and a state machine (using useReducer) that updated the state. This dramatically reduced the complexity of the code, and an enormous amount of code was cut out of the app. Many of the app's dependency updates (via useEffect) disappeared as well, because all the data in the context was updated each time a change was made and React automatically updated the changed components.
- This also removed much of the need for “prop drilling”, because components in the app could send messages to the state machine directly using the dispatch method from the context, rather than needing to have functions passed down to them to update the state at the place where it was modified inline.
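The reducer-based design described above can be sketched in plain TypeScript. This is an illustration under stated assumptions rather than the project's code: the `Goal` and `ModelState` shapes, the action payloads and the `mapGoals` helper are hypothetical, and the sketch deliberately uses neither React nor Redux Toolkit so that it stands alone. Only the action names updateTextForGoalId and addGoalToTree come from the text.

```typescript
// Hypothetical state shape: a single tree of goals is the only copy of the data.
interface Goal {
  readonly id: number;
  readonly text: string;
  readonly children: readonly Goal[];
}

interface ModelState {
  readonly tree: readonly Goal[];
  readonly nextId: number;
}

// Every permitted update is a named event; the payload fields are assumptions.
type ModelAction =
  | { type: "updateTextForGoalId"; id: number; text: string }
  | { type: "addGoalToTree"; parentId: number | null; text: string };

// Rebuild the tree, replacing the goal with the given id by fn(goal).
// Untouched goals are copied structurally; nothing is mutated in place.
function mapGoals(
  goals: readonly Goal[],
  id: number,
  fn: (goal: Goal) => Goal
): readonly Goal[] {
  return goals.map((goal) =>
    goal.id === id ? fn(goal) : { ...goal, children: mapGoals(goal.children, id, fn) }
  );
}

// All state changes live here, so each one can be unit tested in isolation.
function modelReducer(state: ModelState, action: ModelAction): ModelState {
  switch (action.type) {
    case "updateTextForGoalId":
      return {
        ...state,
        tree: mapGoals(state.tree, action.id, (goal) => ({ ...goal, text: action.text })),
      };
    case "addGoalToTree": {
      const added: Goal = { id: state.nextId, text: action.text, children: [] };
      // A null parent adds a top-level goal; an unknown parent id leaves the tree unchanged.
      const tree =
        action.parentId === null
          ? [...state.tree, added]
          : mapGoals(state.tree, action.parentId, (parent) => ({
              ...parent,
              children: [...parent.children, added],
            }));
      return { tree, nextId: state.nextId + 1 };
    }
    default:
      return state;
  }
}
```

In a React component, a function of this shape would be passed to `useReducer(modelReducer, initialState)`, and the returned `dispatch` function shared through a context so that components send events instead of editing the state themselves. Because the reducer is a pure function, it can be tested without rendering any components.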
5.2. Observations
6. Using the Repository Effectively
7. Lessons Learned
- Control the repository carefully. Controlling the repository was discussed at length in the previous section. Without careful management of the repository, many good changes were not adopted. Several innovations in student team projects were never incorporated because there was poor control of branches. Keeping a repository tidy does not come naturally to students. In another student project, which was extended by several student teams, there ended up being 88 branches, most of which had no changes. They should never have been left in the repository. It took an intern several days' work to clear up the branches, checking that there was nothing significant in each branch to be deleted.

  Systematic naming was important, but it needed to be applied consistently. In an earlier version of AMMBER there were develop, test, and main branches for both an editor and an admin interface. Almost none of the branches were used. Setting up and maintaining the branches was an overhead that discouraged timely incorporation of changes. Changes need to be integrated when they are completed, which has not always been students’ practice.
- Review design decisions regularly in light of technology developments. As decisions about changing requirements were made, their implications for the system architecture were not discussed. The project would have benefited from more timely review of those decisions.
- Test as a safeguard for codebase integrity. In contemporary software projects, development is rarely carried out by a single individual. Instead, multiple contributors work concurrently on a shared codebase, often making parallel changes and merging frequently. In such settings, maintaining codebase integrity, that is, ensuring that new changes do not unintentionally break existing functionality, becomes a critical challenge. Automated testing plays a key role in addressing this concern.

  In our experience, testing was often discussed late in the development process, typically during deployment, when failures caused by recent merges became visible. Although we recognised the importance of testing, it was frequently deprioritised because of the perceived complexity arising from tight dependencies within the system. Writing unit or integration tests was seen as difficult and costly, especially in a rapidly evolving codebase. We often assumed that testing could wait until “everything was done”, but a codebase is never truly finished; it evolves continuously, and that endpoint never arrives.

  Despite these initial hesitations, tests proved essential in detecting regressions early and in supporting collaboration among multiple developers. This experience reinforced an important lesson: a codebase must be structured to support testing from the outset. Low coupling and high cohesion not only improve maintainability but also make automated testing feasible. Without such architectural considerations, testing risks becoming an afterthought, acknowledged in principle but avoided in practice.

  In project subjects, virtually all students fail to test their software adequately. There are two primary reasons for this. First, students often run out of time, and by the end of the semester are scrambling to make a demonstration run. Second, rigorous software testing is generally outside students’ direct experience of writing software for subject assignments. They do not have the appropriate mindset.

  Although testing is commonly covered in dedicated subjects, it is frequently taught in a theoretical manner, and the practical skill of developing test scripts is not sufficiently practised. Furthermore, in short-term student projects that typically last only a single semester, manual testing is often perceived as a more efficient temporary solution during development. As a result, students tend to prioritise completing functional requirements over establishing a robust testing framework, because the long-term benefits of automated testing seem irrelevant to their goals.
- Ensure the involvement of real users. When testing the system, students tend to use unrealistic examples. This was true for all of the student project groups that worked on AMMBER or its precursors. There were no usable test data sets, and the weaker interns had left nothing useful behind. Student project teams were reluctant to schedule sessions with actual users. The students needed to see the system used by its real users. That had two positive effects. One was that students could see meaningful examples. The second was that students were motivated by seeing that their work would be valued. Getting an actual user to test the system also helped capture many small bugs, layout issues and inconveniences that were glossed over when developers were focused on just testing their latest changes.

  A corollary is that any software system used for a maintenance project should be chosen carefully. Students are motivated by the thought that the software will be used. There is a trade-off between the software being used and its being on a critical path in a project, with the attendant pressure to deliver something.

  More recent testing has used AMMBER to redraw old projects. Seeing what worked easily led to several improvements. We in fact have a large collection of motivational models that have been stored as diagrams in Dropbox. They have been useful for testing, and for some teaching and research. They need to be handled appropriately and integrated into a test suite. That is a topic for future work.
- Consolidate code, don’t let it fracture. One of the earlier interns was tasked with changing the diagram for people when there was more than one stakeholder. It was difficult to fix because the shape was stored in several places, which caused confusion. When interns are making changes, there can be an instinctive reflex to copy a piece of code and work on the copy separately, to try to minimise the scope of the changes. In this instance, copying the code caused problems because there should have been a single instance that was referenced everywhere. This is really just an instance of DRY (don’t repeat yourself), but that lesson can be hard to learn.
- Be aware of software deprecation possibilities. There have been some major hiccoughs in development caused by packages used in the project no longer being supported. Specifically, the project had been using the mxGraph package, but support for it had been dropped. We were able to move to a replacement package, MaxGraph, but resolving the incompatibilities between the packages felt like a lot of work for minimal progress.

  There is also a need to be aware of security alerts that come up for packages used in the application. The nature of the React ecosystem is that even a modestly sized project like this uses a huge number of packages, any of which can be vulnerable. GitHub has started scanning the package lists of projects it hosts and sending alerts about vulnerabilities, which has been invaluable but also creates work to update the packages.
- Interns work better than software project teams. In the eight years since the original motivational model editor was built, five student project teams at the University of Melbourne have extended the project. That includes one team of eight over a whole year as part of a Masters of Software Engineering degree, and four teams of four or five students doing a one-semester project as part of a Masters of Information Technology degree. There have been around ten cycles of interns working on the project. Mostly the students were undertaking an internship subject, but several were unpaid interns. The interns came from both the University of Melbourne and Swinburne University of Technology. The quality of interns varied considerably. Several of the internships ran completely virtually, originally necessitated by COVID restrictions.

  Our observation is that internships work better than student project teams for maintaining software. There are several reasons. One is that the assessment criteria for the software project unit can get in the way of achieving outcomes for the project; its timelines are also more fixed, and again not aligned with the maintenance needs. There is also more direct mentoring offered to an intern, and the motivation level was usually higher. Internships have in fact led to research projects and other engagements on several occasions.
8. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).