K2E: Building MLOps Environments for Governing Data and Models Catalogues while Tracking Versions

  1. Gorka Zárate
  2. Raúl Miñón
  3. Josu Díaz-de-Arcaya
  4. Ana I. Torre-Bastida
Book:
2022 IEEE 19th International Conference on Software Architecture Companion (ICSA-C): 12-15 March 2022

Publisher: IEEE Computer Society

ISBN: 978-1-6654-9493-9

Publication year: 2022

Conference: IEEE International Conference on Software Architecture Workshops (ICSAW) (19th, 2022)

Type: Conference contribution

DOI: 10.1109/ICSA-C54293.2022.00047

Abstract

Extracting value and information from data currently involves a variety of problems, such as data heterogeneity, data distribution, model versioning, and the vast variety of techniques and approaches. As a result, the data management process becomes hard to implement in real-world scenarios. In this context, catalogue tools for data and Artificial Intelligence models alleviate the burden of dealing with versioning tasks, which facilitates the automation of data and model management processes in line with DataOps and MLOps good practices. This work in progress enumerates key challenges to address when creating these types of catalogues: on the one hand, managing the diverse internal nature of data and models and their different versions, and on the other hand, providing adequate meta-information and governance tools such as access control and auditing. This paper presents the Knowledge to Environment (K2E) platform, whose architecture aims to define the components needed to create environments for working with data and model catalogues. By environment creation, we mean providing a workspace populated with the datasets and models of an organization, while tracking their distinct versions by using specialised catalogues. In addition, this workspace will incorporate added-value tools for governance and auditing. Finally, an approach for implementing K2E is detailed.