Ashik Syed Shaffiullah

AI Engineer | Machine Learning Engineer
Email: ashiksyedshaffiullah@gmail.com
Mobile: +1-312-900-7691

Summary

Founding ML Engineer (2+ years) who ships production LLM and RAG systems: an autonomous bug-fixing agent (32% on SWE-bench), a RAG miner generating ~$200K/month, and AI tools that triple code-triage throughput.

Experience

Founding Machine Learning Engineer (Applied Research)
Nakamoto LLC
Jun 2024 -- Present
Chicago, IL
  • Built a non-agentic LLM-based RAG pipeline for bug localization and patching using structure-aware retrieval, semantic ranking, and AST-guided CoSIL localization; achieved 52% accuracy on SWE-bench Verified, outperforming agent-based baselines.
  • Led the development of a RAG-based miner on Bittensor, collaborating with engineers on DevOps automation (Docker, Kubernetes, Prometheus); reduced downtime by 30% and generated ~$200K/month.
  • Developed modular fine-tuning pipelines for BigCodeBench with Qwen/DeepSeek; enabled 40% faster iteration across 100K+ examples.
  • Boosted symbolic math LLMs via adapter tuning and quantization; improved reasoning by 27% and reduced model size by 65%.
Machine Learning Engineer
Nakamoto LLC
Jan 2024 -- May 2024
Chicago, IL
  • Constructed early-stage fine-tuning pipelines using TRL + QLoRA; improved response quality by 42% under high load.
  • Implemented the multilingual language pipeline for the Zangief subnet; achieved 92% accuracy across 20+ languages using semantic evaluation.
  • Orchestrated daily Apache Airflow DAGs to ingest, validate, and preprocess large-scale mathematical datasets and conversation logs for RAG fine-tuning workflows.
Machine Learning Engineer
NeuralMetrics.ai
Feb 2022 -- Aug 2022
Denver, CO
  • Created a classification engine using AWS Lambda, improving NAICS/SIC categorization accuracy by 25% for insurance industry applications.
  • Optimized XGBoost pipelines, reducing model latency by 30% and increasing prediction accuracy by 20% for real-time risk assessment.
  • Revamped risk scoring models by incorporating Bayesian methods, reducing false positives by 20% and improving model interpretability.

Technical Skills

Programming Languages: Python, TypeScript, JavaScript, Java, SQL
ML & LLMs: PyTorch, HuggingFace, Quantization, TTS/ASR, RAGAS, Vision Models (CLIP, BLIP, DINO), LangChain
Dev & API: FastAPI, Flask, React, Node.js, Celery, GraphQL
MLOps & Infra: MLflow, Prefect, Docker, Kubernetes, CI/CD
Data & Pipelines: Apache Airflow, Spark (PySpark), DVC, PostgreSQL, MongoDB
Certifications: AI Evals for Engineers (Maven, 2025; LLM evaluation, testing, and productionization), Mastering LLMs for Developers (Maven, 2024; fine-tuning, RAG, deployment)

Education

Illinois Institute of Technology
Master of Computer Science
Aug 2022 -- May 2024
Chicago, IL
Panimalar Engineering College (Anna University)
Bachelor of Engineering in Electronics and Communications
Aug 2018 -- Jun 2022
Chennai, India

Projects

Triage.Flow – Agentic GitHub Assistant
[GitHub]
  • Built a production-ready, multi-index RAG assistant that lets developers query repos in natural language across code, issues, PRs, and docs.
  • Orchestrated composite retrieval over 6 indices with advanced caching, FAISS/BM25 ranking, and a FastAPI + WebSocket backend delivering sub-100 ms responses.
  • Created 15+ agent tools (patch linkage, semantic search, onboarding insights, code-evolution tracking), scaling to 1M+ chunks and tripling triage throughput.
Weave – Synthetic Dataset Engine for LLMs
[GitHub]
  • Generated 300K+ synthetic samples with a transformer-compatible engine and configurable noising, boosting domain diversity by 30% and reducing hallucinations.
  • Built a merge pipeline for real + synthetic data, cutting preprocessing time by 40% and ensuring sampling consistency.