Founded in 1999, the Knowledge Discovery and Machine Learning research group (Descoberta do Conhecimento e Aprendizagem de Máquina, DCAM) of PPGIa/PUCPR is currently composed of nine professors. The group conducts theoretical and applied research in machine learning, big data analytics, natural language processing, information retrieval, and computer vision. A detailed description of the group’s scientific production is available here (in Portuguese).
The group is constantly recruiting graduate students (Master's and PhD) to work on research and development projects in partnership with public and private companies. Interested applicants should explore the team page for more information about the group's staff and research topics.
This line of research aims to advance the state of the art in machine learning, a subfield of Artificial Intelligence that studies techniques that enable computers to learn from examples by induction and to apply the learned knowledge to new examples. Considering different real-world applications, data, and types of learning, including supervised, unsupervised, and semi-supervised learning, our research group focuses on classification, clustering, association, and regression tasks. Among the studied themes, the following stand out: generation, selection, and fusion of classifiers; stream learning; representation learning; and deep learning.
This line of research encompasses theoretical and practical advances in different branches of data analysis across different Big Data scenarios and tools. These scenarios are characterized by massive amounts of potentially unstructured data that become available over time and at high speed, which creates the need for specific algorithms and techniques to support social, market, and industry advances.
In this line of research, our goal is to advance the state of the art in textual data processing, with a focus on Brazilian Portuguese. This type of data is increasingly pervasive on the Internet (social media, recommendation websites, etc.) and in industry. The area offers many research gaps and opportunities, given the naturally ambiguous and noisy character of text.
This research line focuses on enabling machines to process, analyze, and extract meaningful information from images and videos. It involves tasks such as object detection, image classification, and segmentation. Additionally, various data sources, such as audio and text, can be integrated into multimodal approaches to enhance understanding and decision-making. Key applications of interest include smart cities (e.g., vehicle identification and parking lot monitoring), healthcare (e.g., medical imaging), and biometrics (e.g., face recognition). The challenges addressed in this field include data acquisition and annotation, few-shot learning, and domain adaptation.