Article
Article name	Metaethics of Artificial Intelligence: Formalization of Value Invariants in its Architecture
Authors	Nevedov L.E., postgraduate, Transbaikal State University, lnevedov@mail.ru
Bibliographic description	Nevedov LE. Metaethics of Artificial Intelligence: Formalization of Value Invariants in its Architecture. Humanitarian Vector. 2026;21(1):17–26. (In Russian). https://www.doi.org/10.21209/1996-7853-2026-21-1-17-26
Section	Socium Axiology
UDK	17:004.8
DOI	https://www.doi.org/10.21209/1996-7853-2026-21-1-17-26
Article type	Original article
Annotation	As artificial intelligence systems spread into socially significant areas, there is a growing need for philosophical analysis not only of their functional effectiveness but also of their internal value structure. Most existing approaches to the ethics of artificial intelligence rely on external normative constraints and regulatory principles, which prove insufficient given the increasing autonomy and cognitive integration of modern intelligent systems. The goal of this study is to develop a theoretical model for the metaethical analysis of artificial intelligence that makes it possible to assess its moral consistency at the cognitive, value, and social levels. The study's hypothesis is that increasing the cognitive integration of artificial intelligence without a comparable development of its value and social components leads to systemic ethical inconsistency. The methodological basis of the study comprises the philosophy of information, integrated information theory, the concept of corrigibility, and the human oversight model. The study uses a comparative case study of algorithmic recommendation systems, generative models, and intelligent medical decision support systems. As a result, an interdisciplinary model of the metaethics of artificial intelligence architecture is proposed, within which the architectural morality of intelligent systems is described as a product of cognitive integration, value orientation, and social context. The analysis of empirical cases shows that intelligent systems with a high level of cognitive integration but insufficient value and social coherence carry increased risks of ethical failure, including amplified bias, shifting of responsibility, and reduced interpretability of decisions. The scientific novelty of the study lies in the development of a metaethical approach to the analysis of artificial intelligence architecture as a carrier of internal value parameters. The results allow the proposed model to be considered as a conceptual basis for the ethical audit of artificial intelligence systems. Prospects for further research relate to the development of formalized indicators of ethical coherence, the integration of metaethical principles into engineering practices, and the development of the concept of architectural humanism.
Key words	artificial intelligence metaethics, integrated information, architectural responsibility, cognitive coherence, value alignment, architectural humanism
Article information
References	Toffler A. The Third Wave. New York: Bantam Books; 1980. 544 p.
Bostrom N. Superintelligence: Paths, Dangers, Strategies. Oxford: Oxford University Press; 2014. 352 p.
Rationality: From AI to Zombies. Berkeley: Machine Intelligence Research Institute; 2015. 1800 p. Available at https://intelligence.org/rationality-ai-zombies (accessed 15.10.2025).
Floridi L. The Ethics of Information. Oxford: Oxford University Press; 2013. 256 p.
Tononi G. Consciousness as Integrated Information: A Provisional Manifesto. Biological Bulletin. 2008;215(3):216-242. DOI: 10.2307/25470707
Tegmark M. Life 3.0: Being Human in the Age of Artificial Intelligence. London: Penguin Books; 2018. 384 p.
Dehaene S, Lau H, Kouider S. What is consciousness, and could machines have it? Science. 2017;358(6362):486-492. DOI: 10.1126/science.aan8871
Cowls J, Floridi L. A Unified Framework of Five Principles for AI in Society. Harvard Data Science Review. 2019;1(1):1-15. DOI: 10.1162/99608f92.8cd550d1
Gabriel I. Artificial Intelligence, Values, and Alignment. Minds and Machines. 2020;30:411-437. DOI: 10.1007/s11023-020-09539-2
Russell S. Human Compatible: Artificial Intelligence and the Problem of Control. New York: Viking; 2019. 325 p. DOI: 10.1007/978-3-030-86144-5_3
Principles for Conscious AI. Conscium; 2024. 42 p. Available at https://conscium.com (accessed 15.10.2025).
Yampolskiy RV. Uncontrollability of AI Systems. Proceedings of the AI Safety Workshop at IJCAI-21. Aachen: CEUR Workshop Proceedings. 2021;2916:1-10. DOI: 10.13140/RG.2.2.35055.66727
Hendrycks D, Burns C, Basart S, Critch A, Li J, Song D, Steinhardt J. Aligning AI with Human Values; 2021. Available at https://arxiv.org/abs/2008.02275 (accessed 15.10.2025). DOI: 10.48550/arXiv.2008.02275
Amodei D, Olah C, Steinhardt J, Christiano P, Schulman J, Mane D. Concrete Problems in AI Safety; 2016. (ed. 2021). Available at https://arxiv.org/abs/1606.06565 (accessed 15.10.2025). DOI: 10.48550/arXiv.1606.06565
Mitchell M. AI’s challenge of understanding the world. Science. 2023:1-15. DOI: 10.1126/science.adm8175
Bommasani R. On the Opportunities and Risks of Foundation Models; 2021. Available at https://arxiv.org/abs/2108.07258 (accessed 15.10.2025). DOI: 10.48550/arXiv.2108.07258
Shen T. Large Language Model Alignment: A Survey; 2023. Available at https://arxiv.org/abs/2309.15025 (accessed 15.10.2025). DOI: 10.48550/arXiv.2309.15025
Lu Q, Zhu L, Xu X, Whittle J, Zowghi D, Jacquet A. Responsible AI Pattern Catalogue: A Collection of Best Practices for AI Governance and Engineering. ACM Computing Surveys. 2024;56(7):1-38. DOI: 10.1145/3626234
Ribeiro MT, Singh S, Guestrin C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM; 2016. P. 1135-1144. DOI: 10.1145/2939672.2939778
Doshi-Velez F, Kim B. Toward a Rigorous Science of Interpretable Machine Learning; 2017. (ed. 2021). Available at https://arxiv.org/abs/1702.08608 (accessed 15.10.2025). DOI: 10.48550/arXiv.1702.08608
Miller T. Explanation in Artificial Intelligence: Insights from Social Science. Artificial Intelligence. 2019;267:1-38. DOI: 10.1016/j.artint.2018.07.007
Cinelli M, Morales G, Galeazzi A, Quattrociocchi W, Starnini M. The Echo Chamber Effect on Social Media. Nature Communications. 2021;12:1-12. DOI: 10.1073/pnas.2023301118
Lewandowsky S, Kozyreva A, Smillie L, Hertwig R. Technology and Democracy: Understanding the Influence of AI. Nature Human Behaviour. 2022;6:1-13. DOI: 10.2760/709177
Fischer K. Tracking Anthropomorphizing Behaviour in Human-Robot Interaction. ACM Transactions on Human-Robot Interaction. 2021;10(2):1-25. DOI: 10.1145/3442677
Floridi L, Taddeo M. What Is Data Ethics? Philosophical Transactions of the Royal Society A. 2021;379:1-11. DOI: 10.1098/rsta.2016.0360
Markovic S, Popovic G, Andjelkovic L. Mastering the Attention Economy: Strategies for Competing on Digital Platforms. Fusion of Multidisciplinary Research An International Journal. 2024;5:68-578. DOI: 10.63995/PGJE3232
Graziano M. A Conceptual Framework for Consciousness in Machines. Journal of Cognitive Neuroscience. 2022;34(2):280-299. DOI: 10.1073/pnas.2116933119
Spivey MJ. Cognitive Science Progresses Toward Interactive Frameworks. Topics in Cognitive Science. 2023;15(2):219-254. DOI: 10.1111/tops.12645
Full article	Metaethics of Artificial Intelligence: Formalization of Value Invariants in its Architecture