A framework for the operationalization of analytic workloads in complex distributed computing environments
dc.contributor.advisor | Almeida, Aitor | es_ES |
dc.contributor.advisor | Torre Bastida, Ana Isabel | es_ES |
dc.contributor.author | Díaz de Arcaya Serrano, Josu | es_ES |
dc.contributor.other | Facultad de Ingeniería, Programa de Doctorado en Ingeniería para la Sociedad de la Información y Desarrollo Sostenible por la Universidad de Deusto | es_ES |
dc.date.accessioned | 2024-10-22T14:20:30Z | |
dc.date.available | 2024-10-22T14:20:30Z | |
dc.date.issued | 2024-04-24 | |
dc.description.abstract | The use of AI-based technologies to improve business processes and competitiveness in an increasingly globalized market is on the rise. However, the success rate of these projects still falls far short of expectations. In this thesis, we focus on several phases of the machine learning life cycle and develop tools and applications that help professionals increase the success of these projects. To this end, a culture of collaboration and communication within the organization is crucial, as is promoting skills training that goes beyond traditional software development. Moreover, organizations should focus on systematizing the data life cycle. In this regard, the aim is to model, through an explicit representation, the flow of the processes that make up the life cycle of machine learning applications. Accordingly, one of the technical contributions of this thesis is a domain-specific language that abstracts away from data scientists the more technical aspects of deploying and operationalizing artificial intelligence processes. This encourages these professionals to focus on extracting business value from data while reducing the effort they must invest in other areas of expertise. Furthermore, the other teams involved in the operationalization process gain greater clarity about the machine learning processes, thereby increasing the efficiency of the project. On the other hand, the emergence of distributed computing paradigms such as cloud computing, combined with the enormous heterogeneity of devices at the edge of the network, means that even industry experts struggle during deployment. To address this, we have developed a tool that uses genetic algorithms to optimize the deployment of machine learning flows against competing goals such as resilience, cost, and network performance, while also taking privacy and model performance criteria into account. We have demonstrated that this tool achieves better results than domain experts on all the objectives evaluated. Finally, this thesis opens up several lines of future research, such as the orchestration of services in 5G networks, the implementation of monitoring agents capable of anticipating and correcting problems, and the development of artificial-intelligence-based tools for other phases of the life cycle, such as monitoring or training. | es_ES |
dc.identifier.uri | http://hdl.handle.net/20.500.14454/1596 | |
dc.publisher | Universidad de Deusto | es_ES |
dc.subject | Mathematics | es_ES |
dc.subject | Computer science | es_ES |
dc.subject | Databases | es_ES |
dc.subject | Artificial intelligence | es_ES |
dc.title | A framework for the operationalization of analytic workloads in complex distributed computing environments | es_ES |
dc.type | Thesis | es_ES |