K2E: Building MLOps Environments for Governing Data and Models Catalogues while Tracking Versions
- Gorka Zárate
- Raúl Miñón
- Josu Díaz-de-Arcaya
- Ana I. Torre-Bastida
Publisher: IEEE Computer Society
ISBN: 978-1-6654-9493-9
Year of publication: 2022
Conference: IEEE International Conference on Software Architecture Workshops (ICSAW) (19th, 2022)
Type: Conference paper
Abstract
Extracting value and information from data currently involves a variety of problems, such as data heterogeneity, data distribution, model versioning, and the vast variety of techniques and approaches. As a result, the data management process is hard to implement in real-world scenarios. In this context, catalogue tools for data and Artificial Intelligence models alleviate the burden of dealing with versioning tasks, thereby facilitating the automation of data and model management processes in compliance with DataOps and MLOps good practices. This work in progress enumerates key challenges to address when creating these types of catalogues: on the one hand, managing the diverse internal nature of data and models and their different versions, and on the other hand, providing adequate meta-information and governance tools such as access control and auditing. This paper presents the Knowledge to Environment (K2E) platform, whose architecture aims to define the components necessary for creating environments that allow working with data and model catalogues. By environment creation, we mean providing a workspace populated with the datasets and models of an organization, while tracking their distinct versions by using specialised catalogues. In addition, this workspace will incorporate added-value tools for governance and auditing. Finally, an approach for implementing K2E is detailed.