Submitted:
07 July 2025
Posted:
08 July 2025
You are already at the latest version
Abstract

Keywords:
1. Introduction
Research Objectives
- To experimentally develop a Retrieval-Augmented Generation (RAG)-based Legal Assistant system named JusticeNetBD focused on providing accessible and accurate legal aid tailored specifically for women.
-
To support legal queries in simple English language during the initial implementation phase, using a curated corpus derived from authoritative Bangladeshi legal sources, including:
- –
- The Bangladesh Penal Code, 1860
- –
- The Women and Children Repression Prevention Act, 2000
- –
- The Dowry Prohibition Act, 2018
- –
- Other relevant acts
- To ensure contextually grounded responses by integrating these legal documents into a retrieval system that feeds into a generative language model.
-
To rigorously evaluate the assistant’s performance using quantitative metrics such as:
- –
- Recall@k and Mean Reciprocal Rank (MRR)
- –
- ROUGE-L
- –
- BERTScore F1
- To benchmark the assistant’s performance against contemporary state-of-the-art and commercial models such as ChatGPT-4o Turbo, Gemini Flash 2.5 and DeepSeek-V3 using response generation metrics and statistical hypothesis testing, thereby assessing its comparative effectiveness in the domain of legal conversational AI.
2. Related Work
2.1. Machine Learning for Safety
2.2. AI in Social Applications for Safety
2.3. Retrieval-Augmented Generation in Legal AI
Research Gaps and Rationale
3. Methodology
3.1. Corpus Construction from Legal and Institutional Sources
- Bangladesh Penal Code, 1860: Selected sections on rape, miscarriage, and wrongful confinement relevant to gender-based violence cases.
- Women and Children Repression Prevention Act, 2000: All sections related to women and child abuse, including punishment, procedural guidelines, and trial procedures in special tribunals.
- Dowry Prohibition Act, 2018: Legal definitions, punishable offenses, and complaint mechanisms concerning dowry-related harassment and violence.
- Domestic Violence (Prevention and Protection) Act, 2010: Documentation related to domestic violance, rights of aggrieved person, custody, and reconciliation procedures.
- Cyber Security Ordinance 2025: Sections related to cyber harassment, threats, defamation, and unauthorized disclosure of private information, particularly relevant to online abuse cases.
- NGO Legal Aid and Support Info: Helpline numbers, and legal aid procedures were extracted from reputable NGOs such as BRAC, Ain o Salish Kendra (ASK), and Bangladesh Legal Aid and Services Trust (BLAST).
3.2. Language Model Configuration for RAG Generation
- Temperature = 0.4: This parameter controls the randomness of token selection. Lower values (closer to 0) make the model more deterministic and focused, reducing variation in outputs. A value of 0.4 is selected to balance factual consistency with slight linguistic diversity in answers.
- Max_tokens = 800: This setting determines the upper bound of tokens in the generated output. It ensures that the chatbot provides sufficiently informative responses without exceeding reasonable interaction length. The value can be tuned by the developer based on response verbosity requirements.
- Stop Sequences and Prompt Formatting: Custom prompt templates were designed to encourage concise, empathetic, and legally accurate outputs. The model was conditioned to avoid speculative or hallucinated statements and to cite retrieved legal segments when available.
3.3. Retrieval-Augmented Generation Workflow
| Algorithm 1 RAG-Based Legal Question Answering Workflow |
![]() |
3.3.1. Corpus Chunking
3.3.2. Embedding with Sentence Transformer
3.3.3. Indexing and Storage
3.3.4. Query Embedding and Context Retrieval
3.3.5. Prompt Construction and Generation
"You are a helpful and knowledgeable AI assistant. Always respond based on the following trusted legal and institutional context: [retrieved text]"
3.3.6. Response Handling and History Update
Illustrative Simulation: A Query–Response Example
- 1.
- The query q is embedded and matched against the FAISS index.
- 2.
-
Two retrieved chunks from the Dowry Prohibition Act 2018 are returned:
- “Any demand for dowry, whether before or after marriage, is punishable by law under Section 3. A woman may file a complaint directly to the nearest police station or seek legal aid.”
- “The penalty includes imprisonment up to five years and/or a monetary fine. Complaints can also be submitted to women’s helpline 109.”
- 3.
- The model is prompted with this legal context and returns:
"Under the Dowry Prohibition Act 2018, demanding dowry after marriage is illegal and punishable. You may file a complaint at your local police station or call the government helpline 109. Legal aid is also available through NGOs like ASK or BLAST."
3.4. Deployment, Safety, and Usage Safeguards
3.5. Performance Evaluation Metrics
3.5.1. Recall@k
3.5.2. Mean Reciprocal Rank (MRR)
3.5.3. ROUGE-L
3.5.4. BERTScore F1
4. Results and Discussion
4.1. Quantitative Results
| Model | Recall@2 | MRR | R-L | BS |
|---|---|---|---|---|
| JusticeNetBD | 0.90 | 0.90 | 0.463 | 0.896 |
| DeepSeek V3 (Closed-Book) | n/a | n/a | 0.210 | 0.850 |
| Gemini Flash 2.5 (Closed-Book) | n/a | n/a | 0.242 | 0.863 |
| ChatGPT 4o-Turbo (Closed-Book) | n/a | n/a | 0.221 | 0.862 |
- Mitigate hallucination: ensuring that every generated claim can be traced to retrieved source text.
- Enhance lexical fidelity: resulting in improved ROUGE-L scores due to accurate legal phrasing.
- Preserve linguistic fluency: maintaining BERTScore values comparable to human-generated responses.
4.2. Inference Time Analysis
Statistical Significance Test
- k is the number of groups,
- is the number of observations in group i,
- is the sum of ranks for group i,
- N is the total number of observations across all groups.
Analysis of Results
-
JusticeNetBD vs. Other Models:
- –
- Shows highly significant differences compared to ChatGPT-4o Turbo () and DeepSeek-V3 ()
- –
- No significant difference from Gemini Flash 2.5 ()
-
Gemini Flash 2.5 Performance:
- –
- Significantly faster than DeepSeek-V3 ()
- –
- Marginally faster than ChatGPT-4o Turbo (, not significant at )
-
Top-tier Models Comparison:
- –
- No significant difference between ChatGPT-4o Turbo and DeepSeek-V3 ()
5. Conclusion
Appendix A. User Prompts for Model Evaluation
- PROMPT 1:
-
Dowry Demand"My groom demanded money after our marriage. Is it a crime?"
- PROMPT 2:
-
Attempted Murder"A man in the streets attempted to poison me to death. What punishments will he get?"
- PROMPT 3:
-
Kidnapping for Ransom"My child is detained for ransom. What will be the punishment of the kidnapper?"
- PROMPT 4:
-
Gang Rape"What is the punishment for a man who gang rapes a woman?"
- PROMPT 5:
-
Psychological Abuse"A boy told me that he would not touch me, abuse me, but do something that will force me to do suicide! Is that a crime according to the law?"
- PROMPT 6:
-
Non-consensual Contact"A man touched me without my consent. Is that a punishable crime?"
- PROMPT 7:
-
Dowry-related Murder Attempt"My husband’s family tried to murder me as my family does not want to give them dowry. What punishment will my husband’s family get?"
- PROMPT 8:
-
Child Maiming"What is the punishment for a man who disables a child’s hand to make him a beggar?"
- PROMPT 9:
-
Child of Rape"A child born due to the results of rape. Who will take care of the child?"
- PROMPT 10:
-
Victim Identification"I am a victim according to the Women and Children Repression Prevention Act. A newspaper published my identity regarding this. Is that a crime by the publishers?"
References
- Bangladesh Bureau of Statistics. (2016). Report on violence against women survey. Ministry of Planning, Government of Bangladesh.
- Ain o Salish Kendra. (2022). Annual report on human rights.
- Centre for Policy Dialogue. (2021). Digital rights and safety of women in Bangladesh. CPD Working Paper Series.
- Ahmmed, M. E. (2023). Access to justice for illiterate women in the southern char areas of Bangladesh. SSRN Working Paper.
- Naznin, S. M. (2020). Women’s rights to access to justice: The role of public interest litigation in Bangladesh. Australian Journal of Asian Law, 21, 99–116.
- Human Rights Watch. (2020). Why is it so difficult for Bangladeshi women to get justice?
- Rahman, M. A. (2024). Understanding the developmental approach to legal pluralism and access to justice in Bangladesh. Australian Journal of Asian Law, 25, 17–34.
- Begum, A., & Saha, N. K. (2017). Women’s access to justice in Bangladesh: Constraints and way forward. Journal of Malaysian and Comparative Law, 44, 39–58.
- UN Women. (2020). UN Women Bangladesh.
- National Legal Aid Services Organization. (2021). Annual legal aid report. Ministry of Law, Justice and Parliamentary Affairs, Bangladesh.
- Nadarzynski, T., Miles, O., Cowie, A., & Ridge, D. (2019). Acceptability of artificial intelligence (AI)-led chatbot services in healthcare: A mixed-methods study. Digital Health, 5, 2055207619871808. [CrossRef]
- Dede, C. (1986). A review and synthesis of recent research in intelligent computer-assisted instruction. International Journal of Man-Machine Studies, 24(4), 329–353. [CrossRef]
- Rajendran, R. K., Vetrivel, S., & NR, W. B. (2025). The role of AI in enhancing access to justice and legal services. In Exploration of AI in contemporary legal systems (pp. 139–162). IGI Global.
- Chien, C. V., & Kim, M. (2024). Generative AI and legal aid: Results from a field study and 100 use cases to bridge the access to justice gap. Loyola of Los Angeles Law Review, 57, 903–940.
- Mariani, M. M., Hashemi, N., & Wirtz, J. (2023). Artificial intelligence empowered conversational agents: A systematic literature review and research agenda. Journal of Business Research, 161, 113838. [CrossRef]
- Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., ... & Riedel, S. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. Advances in Neural Information Processing Systems, 33, 9459–9474.
- Maynez, J., Narayan, S., Bohnet, B., & McDonald, R. (2020). On faithfulness and factuality in abstractive summarization. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 1906–1919).
- Guu, K., Lee, K., Tung, Z., Pasupat, P., & Chang, M. (2020). Retrieval augmented language model pre-training. In Proceedings of the 38th International Conference on Machine Learning (pp. 3749–3761).
- Karpukhin, V., Oguz, B., Min, S., Lewis, P., Wu, L., Edunov, S., ... & Yih, W.-t. (2020). Dense passage retrieval for open-domain question answering. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (pp. 6769–6781).
- Tiwari, A. K., & Marisport, A. (2024). Leveraging artificial intelligence to address domestic violence against women with disabilities in India. In Proceedings of the 2024 International Conference on AI for Good (pp. 1–6). IEEE.
- Ketnoi, N., Daenglim, T., & Chaisiriprasert, P. (2024, November). A new approach for sentiment analysis of sexual harassment in Thai sentences using transformer models. In 2024 8th International Conference on Information Technology (InCIT) (pp. 1–6). IEEE.
- Chavez, C. V., Ruiz, E., Rodriguez, A. G., Pena, I. R., Larios, V. M., Villanueva-Rosales, N., & Cheu, R. L. (2019, October). Towards improving safety in urban mobility using crowdsourcing incident data collection. In 2019 IEEE International Smart Cities Conference (ISC2) (pp. 626–631). IEEE.
- Aldkheel, A. (2024). Design and evaluation of a conversational agent for supporting domestic violence survivors (Doctoral dissertation, University of North Carolina at Charlotte).
- Al-Shaikh, A., AlAlfi, A. H., Al-Nsour, E. Y. A., & others. (2024). Using artificial intelligence techniques in law enforcement: A survey. In Arab Conference on Smart Systems. IEEE.
- Cenci, A. (2025). Citizen science and negotiating values in the ethical design of AI-based technologies targeting vulnerable individuals. AI and Ethics. [CrossRef]
- Panadés, R., & Yuguero, O. (2025). Cyber-bioethics: The new ethical discipline for digital health. Frontiers in Digital Health, 4, 1523180. [CrossRef]
- Ibrahim, A. M., Okesanya, O. J., Ukoaka, B. M., & Ahmed, M. M. (2025). Harnessing artificial intelligence to address diseases attributable to unsafe drinking water. Discover Water, 5, 1–6. [CrossRef]
- Parihar, M. R., & Koolwal, M. (2025). Ethical and legal implications of using AI for predictive policing in child offenses. International Journal of Environmental Safety, 11, 1201–1213.
- Kalra, R., Wu, Z., Gulley, A., Hilliard, A., Guan, X., Koshiyama, A., & Treleaven, P. (2024). HyPA-RAG: A Hybrid Parameter Adaptive Retrieval-Augmented Generation System for AI Legal and Policy Applications. arXiv preprint arXiv:2409.09046.
- Rafat, M. I. (2024). AI-powered Legal Virtual Assistant: Utilizing RAG-optimized LLM for Housing Dispute Resolution in Finland.
- Lee, H. H., Chen, C. C., & Yen, A. Z. (2025, May). RAG-Enhanced Evidence Recommendation in Financial Legal Resolutions. In Companion Proceedings of the ACM on Web Conference 2025 (pp. 1096-1099).
- Schwarcz, D., Manning, S., Barry, P., Cleveland, D. R., Prescott, J. J., & Rich, B. (2025). Ai-powered lawyering: Ai reasoning models, retrieval augmented generation, and the future of legal practice.
- Amato, F., Cirillo, E., Fonisto, M., & Moccardi, A. (2024, December). Optimizing Legal Information Access: Federated Search and RAG for Secure AI-Powered Legal Solutions. In 2024 IEEE International Conference on Big Data (BigData) (pp. 7632-7639).
- Grattafiori, A., Dubey, A., Jauhri, A., Pandey, A., Kadian, A., Al-Dahle, A., ... & Vasic, P. (2024). The LLaMA 3 herd of models. arXiv Preprint arXiv:2407.21783.
- Simanto, S. H. (2025). JusticeNetBD [Computer software]. GitHub. https://github.com/SakibHasanSimanto/JusticeNetBD.








| Question | DeepSeek-V3 | Gemini Flash 2.5 | ChatGPT-4o Turbo | JusticeNetBD |
| 1 | 17.52 | 7.88 | 5.13 | 1.76 |
| 2 | 19.74 | 3.15 | 11.11 | 1.79 |
| 3 | 18.36 | 2.89 | 18.13 | 1.25 |
| 4 | 20.26 | 2.51 | 19.76 | 1.40 |
| 5 | 17.32 | 4.18 | 17.49 | 1.51 |
| 6 | 18.23 | 3.39 | 20.11 | 1.50 |
| 7 | 18.95 | 2.49 | 19.18 | 1.31 |
| 8 | 21.12 | 2.94 | 16.42 | 1.63 |
| 9 | 19.54 | 3.28 | 16.88 | 1.25 |
| 10 | 20.59 | 3.75 | 14.59 | 1.63 |
| Model | Mean Time (s) | Std Dev | Standard Error (95% CI) |
|---|---|---|---|
| DeepSeek-V3 | 19.16 | 1.23 | 0.41 |
| Gemini Flash 2.5 | 3.65 | 1.49 | 0.50 |
| ChatGPT-4o Turbo | 15.88 | 4.40 | 1.46 |
| JusticeNetBD | 1.50 | 0.19 | 0.06 |
| ChatGPT-4o Turbo | DeepSeek-V3 | Gemini Flash 2.5 | JusticeNetBD | |
|---|---|---|---|---|
| ChatGPT-4o Turbo | 1.000000 | 1.000000 | 0.130230 | 0.000142 |
| DeepSeek-V3 | 1.000000 | 1.000000 | 0.004257 | 0.000000 |
| Gemini Flash 2.5 | 0.130230 | 0.004257 | 1.000000 | 0.320128 |
| JusticeNetBD | 0.000142 | 0.000000 | 0.320128 | 1.000000 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
