Machine Learning for Socio-Technical Systems Lab (ML4STS), University of Rhode Island
Benchmarking LLM-Agents at Fair Machine Learning
As a software engineer, researcher, and maintainer of the lab's LLM-agent systems, I helped develop a benchmarking framework for evaluating large language model agents on fairness and reliability. My responsibilities included maintaining and optimizing the agentic AI systems under test, scheduling batch evaluation jobs with SLURM on an HPC cluster, optimizing CUDA-based GPU computation, and contributing to a forthcoming research paper.
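To give a flavor of the SLURM side of this work, below is a minimal sketch of how per-model benchmark runs can be queued from Python. The partition name, resource limits, model list, and the `run_benchmark.py` entry point are illustrative assumptions, not the actual project configuration.

```python
"""Minimal sketch: queueing one LLM-agent benchmark run per model via SLURM.

All names here (partition, entry point, model list) are illustrative
assumptions, not the actual project configuration.
"""
import subprocess

# Hypothetical models to evaluate; the real framework used its own registry.
MODELS = ["llama-3-8b", "mistral-7b"]

SBATCH_TEMPLATE = """#!/bin/bash
#SBATCH --job-name=fairbench-{model}
#SBATCH --partition=gpu          # assumed partition name
#SBATCH --gres=gpu:1             # one GPU per evaluation run
#SBATCH --time=04:00:00
#SBATCH --mem=32G
#SBATCH --output=logs/{model}-%j.out

# `run_benchmark.py` is a placeholder for the framework's entry point.
python run_benchmark.py --model {model} --suite fairness
"""

def submit(model: str) -> None:
    """Render the batch script for one model and pipe it to sbatch."""
    script = SBATCH_TEMPLATE.format(model=model)
    # sbatch reads the job script from stdin when no file is given.
    result = subprocess.run(
        ["sbatch"], input=script, text=True,
        capture_output=True, check=True,
    )
    print(result.stdout.strip())  # e.g. "Submitted batch job 12345"

if __name__ == "__main__":
    for model in MODELS:
        submit(model)
```

Submitting one job per model keeps failures isolated and lets the scheduler place runs on free GPUs independently.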
This experience deepened my understanding of agentic AI evaluation and the challenges of measuring fairness and performance in large language models.
Skills: Python, HPC, SLURM, CUDA, AI Systems, Benchmarking, Research