Angel Daza CV

My Skills

Over the course of my professional career, I have worked as a Software Developer, Data Scientist and Researcher. My main focus is Natural Language Processing (NLP), a technology that aims to help computers understand and interact with humans through the use of language. This work has allowed me to develop a range of technical, analytical and soft skills. My main programming language is Python, and my research expertise covers multilingual neural models, semantic analysis, NLP for digital humanities and the evaluation of NLP software.

Work Experience

I have worked in industry, in academia and as a freelancer. I really like building fast prototypes to test new ideas and applying the latest technologies to solve "old school" problems. I have been building NLP models since the days of Recurrent Neural Networks and LSTMs and through the transition to Transformer models, and I now also work with the latest Large Language Model tools (although this work is mostly about evaluating how much of the hype out there is actually useful!). I also enjoy working on Digital Humanities, using my tech expertise to help researchers in the humanities analyze large amounts of textual data to answer their questions.

Education

My studies allowed me to obtain interesting degrees and also to get to know people from different places and cultures. I did my Bachelor's and Master's in Mexico City, with internships in Madrid and Tokyo and summer schools in Tübingen and Bilbao, followed by a PhD in Heidelberg and a Postdoc in Amsterdam.

Selected Publications

I have published as first author and co-author at top-tier Computational Linguistics, Computer Science and Digital Humanities venues, such as ACL, EMNLP and NAACL.

Work Experience

Netherlands eScience Center

Research Software Engineer
January 2024 - Present

I am currently working as an NLP Engineer on different academic projects. I collaborate with the Impact and Fiction and The Semantics of Sustainability projects.

Vrije Universiteit Amsterdam

NLP Researcher (Postdoc)
April 2021 - December 2023

As part of the CLTL group, I focused on developing NLP models and text mining strategies for the automatic processing of biographical texts. This included dynamic language analysis of biographies and evaluating the performance of state-of-the-art models. This work was part of the InTaVia project, which aimed to address major research challenges and bridge the semantic gap between large object databases, biography databases, and users across Europe.

Leibniz Institut für Deutsche Sprache

NLP Developer
August 2020 - February 2021

Implemented state-of-the-art (SOTA) lemmatizers and part-of-speech taggers for large-scale German resources (~50 billion tokens).

Universität Heidelberg

PhD Researcher
April 2017 - July 2020

My research focused on finding effective methods for creating more training data for the task of Semantic Role Labeling (SRL) in several languages (particularly German). I approached SRL both as a sequence classification task and as a sequence generation task. Through my research I worked with Recurrent Neural Networks, LSTMs, Sequence-to-Sequence models and multilingual Neural Language Models such as ELMo and BERT. I implemented my research code in PyTorch and also used state-of-the-art frameworks such as spaCy, Transformers and AllenNLP when building more complex models.

Education

Universität Heidelberg

Doctor of Philosophy

Computational Linguistics / Natural Language Processing

Thesis: Cross-lingual Semantic Role Labeling through Translation and Multilingual Learning

Supervisor: Prof. Dr. Anette Frank

Instituto Politécnico Nacional - Centro de Investigación en Computación (CIC)

Master of Science

Computer Science

Thesis: Automatic Text Generation by Learning from Literary Structures

Supervisor: Prof. Dr. Hiram Calvo

Tecnológico de Monterrey

Bachelor of Science

Computer Systems Engineer

Specialization in Artificial Intelligence

Selected Publications

In the Context of Narrative, we Never Properly Defined the Concept of Valence

Boot, P., Daza, A., Schnober, C., van Hage, W. (2024).

CHR 2024: Computational Humanities Research Conference
Aarhus, Denmark

Choosing the Right Tool for You: Informed Evaluation of Text Analysis Tools

Daza, A., Fokkens, A. (2024).

CLARIN Annual Conference Proceedings (CLARIN 2024)
Barcelona, Spain

Confidently Wrong: Exploring the Calibration and Expression of (Un)Certainty of Large Language Models in a Multilingual Setting

Krause, L., Tufa, W., Baez-Santamaria, S., Daza, A., Khurana, U., Vossen, P. (2023).

Proceedings of the Workshop on Multimodal, Multilingual Natural Language Generation and Multilingual WebNLG Challenge (MM-NLG 2023)
Prague, Czech Republic

Dealing with Abbreviations in the Slovenian Biographical Lexicon

Daza, A., Fokkens, A., Erjavec, T. (2022).

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)
Abu Dhabi, UAE

Weisfeiler-Leman in the bamboo: Novel AMR graph metrics and a benchmark for AMR graph similarity

Opitz, J., Daza, A., Frank, A. (2021).

Transactions of the Association for Computational Linguistics
TACL Journal

X-SRL: A Parallel Cross-Lingual Dataset for Semantic Role Labeling

Daza, A. and Frank, A. (2020).

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)
Online-Only

Translate and Label! An Encoder-Decoder Approach for Cross-lingual Semantic Role Labeling

Daza, A. and Frank, A. (2019).

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019)
Hong Kong, China

A Sequence-to-Sequence Model for Semantic Role Labeling.

Daza, A. and Frank, A. (2018).

Proceedings of the 3rd Workshop on Representation Learning for NLP (RepL4NLP) - ACL 2018
Melbourne, Australia

Automatic Text Generation by Learning from Literary Structures

Daza, A., Calvo, H., Figueroa-Nazuno, J. (2016).

Proceedings of the Workshop on Computational Linguistics for Literature, co-located with the North American Chapter of the Association for Computational Linguistics - NAACL 2016
San Diego, California

Dr. José Angel Daza Arévalo

Amsterdam, The Netherlands
