Applied AI Agent Developer candidate

I turn ambiguous AI ideas into working agent systems with tools, retrieval, evidence, and evaluation.

My current focus is ChainRisk Agent, an evaluation-backed Web3 risk triage prototype that separates deterministic risk scoring from LLM-generated summaries.

About

Engineering profile

I am completing an engineering master's degree at the University of Sheffield after a dual-degree background in automatic control systems. My practical work sits at the intersection of LLM applications, RAG, backend APIs, and system safety.

For AI Agent roles, I position myself as a hands-on builder: I can break a fuzzy task into input routing, tool calls, retrieval, deterministic checks, structured output, logs, and evaluation cases.
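
A minimal sketch of that breakdown, not the ChainRisk Agent code itself: the signal names, weights, thresholds, and function names below are all hypothetical and chosen only for illustration. The score comes from a deterministic function, the output is a fixed structure, and the summarizer (a stand-in for an LLM call) only receives the finished result.

```python
from dataclasses import dataclass, replace
from typing import Callable

# Hypothetical signal weights, for illustration only.
WEIGHTS = {"honeypot": 0.5, "holder_concentration": 0.3, "low_liquidity": 0.2}


@dataclass(frozen=True)
class RiskResult:
    """Structured output: the summary is the only free-text field."""
    score: float
    level: str
    evidence: tuple[str, ...]
    summary: str = ""


def score_signals(signals: dict[str, bool]) -> RiskResult:
    """Deterministic check: the same signals always give the same score."""
    # Unknown signal names are ignored rather than guessed at.
    fired = tuple(sorted(n for n, on in signals.items() if on and n in WEIGHTS))
    score = round(sum((WEIGHTS[n] for n in fired), 0.0), 2)
    # Hypothetical thresholds.
    level = "high" if score >= 0.5 else "medium" if score >= 0.2 else "low"
    return RiskResult(score=score, level=level, evidence=fired)


def triage(signals: dict[str, bool],
           summarize: Callable[[RiskResult], str]) -> RiskResult:
    """Score first, then attach a summary that cannot alter the score."""
    result = score_signals(signals)
    return replace(result, summary=summarize(result))


def template_summary(result: RiskResult) -> str:
    """Stand-in for an LLM call; it only restates fields it was given."""
    evidence = ", ".join(result.evidence) or "none"
    return f"{result.level} risk ({result.score}); evidence: {evidence}"
```

Calling `triage({"honeypot": True, "low_liquidity": True}, template_summary)` returns a `RiskResult` whose score, level, and evidence were fixed before the summarizer ran, so an evaluation case can assert on those fields without parsing generated text.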

Experience

RAG and applied AI work

2025.02 - 2025.08

RAG Engineer Intern, Robin AI

Worked on retrieval-augmented question answering for contract and legal-document workflows, focusing on document preprocessing, retrieval quality, citation-aware prompting, and failure analysis for internal evaluation.

  • Handled document-structure issues such as OCR text, natural section boundaries, tables, and bulleted content.
  • Explored query expansion, hybrid retrieval, reranking, and citation-aware answer generation.
  • Kept claims evidence-bound: a metric is presented publicly as production impact only when an internal report backs it.

Evidence

What can be verified locally

23

Unit tests

Coverage for safety blocking, missing data, retrieval metadata, public labels, and risk signal fusion.

10

Workflow cases

Local evaluation cases for risk levels, evidence, unsupported claims, and latency tracing.

6

RAG cases

Small retrieval smoke test covering liquidity, rug pull, honeypot, holder concentration, phishing, and failed transaction risk.

70%

Real-CATS recall

Dataset-specific behavior-fusion baseline on held-out addresses; useful signal, not a live fraud-detection claim.

Skills

Working stack

Languages and backend

Python, FastAPI, REST APIs, schema design, unit tests, local smoke servers.

LLM systems

RAG, agent workflows, prompt engineering, hybrid retrieval, reranking, evaluation.

Web3 risk tooling

Read-only tooling, public label benchmark, RiskSignal fusion, safety guardrails.