Fine-tuning and Evaluation Pipeline for a Geotechnical Large Language Model

We are seeking a capable placement student to deliver the machine-learning component of a QAA-funded project developing a discipline-aware AI tool for engineering education. The role supports a collaboration between the Universities of Cardiff, Manchester, Surrey, and Glasgow that fine-tunes open-weight language models on curated geotechnical material and evaluates them against a domain benchmark, with training on Cardiff's ARCCA HPC/GPU infrastructure. In practical terms, the finished tool will be a specialist AI assistant for geotechnical engineering, used by undergraduate students across the four partner universities as a reliable, source-grounded alternative to general-purpose chatbots such as ChatGPT.

Key Responsibilities

Model Fine-tuning (55%)

Implement a parameter-efficient fine-tuning pipeline (LoRA / QLoRA) for an open-weight large language model
Manage dataset formatting, tokenisation, checkpointing, and configuration management for reproducibility
Iterate over training hyperparameters against the domain evaluation set to optimise model performance
Monitor GPU utilisation and cost
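
To illustrate why a parameter-efficient method such as LoRA helps keep GPU cost manageable, the sketch below counts the trainable parameters of a LoRA adapter on a single weight matrix and compares that with updating the full matrix. The matrix dimensions and rank are hypothetical values chosen for illustration only; they are not project settings, and the function names are not taken from any library.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters updated when the whole d_out x d_in weight matrix is trained."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical sizes, for illustration only.
D_OUT, D_IN, RANK = 4096, 4096, 8

full = full_params(D_OUT, D_IN)
lora = lora_trainable_params(D_OUT, D_IN, RANK)
fraction = lora / full

print(f"full fine-tuning: {full} parameters")
print(f"LoRA (rank {RANK}): {lora} parameters ({fraction:.2%} of full)")
```

Under these assumed sizes the adapter trains well under one percent of the parameters in that matrix, which is the main reason such methods fit on shared GPU allocations.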

Evaluation Harness (30%)

Build an automated evaluation harness that runs the domain benchmark through base and fine-tuned models
Report scores under the five-tier taxonomy and support inter-metric analysis alongside the benchmark intern
Maintain reproducible evaluation runs with version-controlled configurations and seeded randomness
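
As a rough indication of what this work involves, the sketch below runs a toy benchmark through a "base" and a "fine-tuned" model with a seeded sampler and a fingerprinted configuration. Everything in it is an assumption made for illustration: the models are stub functions, the benchmark items are placeholders, exact-match accuracy stands in for the project's actual scoring scheme, and names such as `EvalConfig` and `run_eval` are hypothetical.

```python
import hashlib
import json
import random
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class EvalConfig:
    """Hypothetical evaluation settings that would live under version control."""
    seed: int = 0
    sample_size: int = 2


def config_fingerprint(cfg: EvalConfig) -> str:
    """Short, stable hash of the config so each run can be tied to its settings."""
    payload = json.dumps(asdict(cfg), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def run_eval(
    models: Dict[str, Callable[[str], str]],
    benchmark: List[dict],
    cfg: EvalConfig,
) -> Dict[str, float]:
    """Score every model on the same seeded sample using exact-match accuracy."""
    rng = random.Random(cfg.seed)  # local RNG: independent of global random state
    items = rng.sample(benchmark, k=min(cfg.sample_size, len(benchmark)))
    scores = {}
    for name, model in models.items():
        correct = sum(
            model(item["question"]).strip().lower() == item["answer"].strip().lower()
            for item in items
        )
        scores[name] = correct / len(items)
    return scores


# Placeholder benchmark and stub models (not real project data or models).
BENCHMARK = [
    {"question": "q1", "answer": "a1"},
    {"question": "q2", "answer": "a2"},
    {"question": "q3", "answer": "a3"},
]
ANSWER_KEY = {item["question"]: item["answer"] for item in BENCHMARK}
MODELS = {
    "base": lambda question: "unknown",
    "fine_tuned": lambda question: ANSWER_KEY[question],
}

cfg = EvalConfig(seed=42, sample_size=2)
scores = run_eval(MODELS, BENCHMARK, cfg)
print(config_fingerprint(cfg), scores)
```

Using a local `random.Random(seed)` rather than the global generator means two runs with the same configuration sample the same items, so score differences reflect the models and not the draw.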

Documentation and Reporting (15%)

Produce reproducible training configurations, model cards, and release notes
Contribute to project interim outputs, progress reports, and dissemination materials

Required Skills and Attributes

Background in computer science, software engineering, data science, or a closely related engineering discipline
Strong Python programming and reproducible software-engineering discipline, including version control and dependency management
Familiarity with PyTorch or an equivalent deep-learning framework

Desirable Experience

Prior exposure to Hugging Face Transformers, PEFT / LoRA, or similar fine-tuning toolkits
Experience running workloads on HPC or GPU infrastructure
Exposure to large language model evaluation or benchmarking

Support and Development

The successful candidate will work under the direct supervision of Dr Evan Ricketts and Dr Fei Jin, with technical support from Cardiff ARCCA. This position is an excellent opportunity for computer science students to gain hands-on experience of open-weight LLM fine-tuning and reproducible ML engineering, and to contribute to a peer-reviewed research output. Interns will be named in project outputs and acknowledged in any research publications arising from their contributions, with the opportunity for co-authorship where individual contributions are strong.

