Ashik Syed Shaffiullah

AI Engineer | Machine Learning Engineer

Mobile: +1-312-900-7691

Summary

Founding ML Engineer (2+ years) who ships production LLM and RAG systems: an autonomous bug-fixing agent (32% on SWE-bench), a RAG miner generating ~$200K/month, and AI tools that triple code-triage throughput.

Experience

Founding Machine Learning Engineer (Applied Research)

Nakamoto LLC

Jun 2024 – Present

Chicago, IL

Built a non-agentic, LLM-based RAG pipeline for bug localization and patching using structure-aware retrieval, semantic ranking, and AST-guided CoSIL localization; achieved 52% accuracy on SWE-bench Verified, outperforming agent-based baselines.
Led the development of a RAG-based miner on Bittensor, collaborating with engineers on DevOps automation (Docker, Kubernetes, Prometheus); reduced downtime by 30% and generated ~$200K/month.
Developed modular fine-tuning pipelines for BigCodeBench with Qwen/DeepSeek; enabled 40% faster iteration across 100K+ examples.
Boosted symbolic math LLMs via adapter tuning and quantization; improved reasoning by 27% and reduced model size by 65%.

Machine Learning Engineer

Nakamoto LLC

Jan 2024 – May 2024

Chicago, IL

Constructed early-stage fine-tuning pipelines using TRL + QLoRA; improved response quality by 42% under high load.
Implemented the multilingual language pipeline for the Zangief subnet; achieved 92% accuracy across 20+ languages using semantic evaluation.
Orchestrated daily Apache Airflow DAGs to ingest, validate, and preprocess large-scale mathematical datasets and conversation logs for RAG fine-tuning workflows.

Machine Learning Engineer

NeuralMetrics.ai

Feb 2022 – Aug 2022

Denver, CO

Created a classification engine using AWS Lambda, improving NAICS/SIC categorization accuracy by 25% for insurance industry applications.
Optimized XGBoost pipelines, reducing model latency by 30% and increasing prediction accuracy by 20% for real-time risk assessment.
Revamped risk-scoring models by incorporating Bayesian methods, reducing false positives by 20% and improving model interpretability.

Technical Skills

Programming Languages: Python, TypeScript, JavaScript, Java, SQL

ML & LLMs: PyTorch, Hugging Face, Quantization, TTS/ASR, RAGAS, Vision Models (CLIP, BLIP, DINO), LangChain

Dev & API: FastAPI, Flask, React, Node.js, Celery, GraphQL

MLOps & Infra: MLflow, Prefect, Docker, Kubernetes, CI/CD

Data & Pipelines: Apache Airflow, Spark (PySpark), DVC, PostgreSQL, MongoDB

Certifications: AI Evals for Engineers (Maven, 2025; LLM evaluation, testing, and productionization), Mastering LLMs for Developers (Maven, 2024; fine-tuning, RAG, deployment)

Education

Illinois Institute of Technology

Master of Computer Science

Aug 2022 – May 2024

Chicago, IL

Panimalar Engineering College (Anna University)

Bachelor of Engineering in Electronics and Communications

Aug 2018 – Jun 2022

Chennai, India

Projects

Triage.Flow – Agentic GitHub Assistant

[GitHub]

Built a production-ready, multi-index RAG assistant that lets developers query repos in natural language across code, issues, PRs, and docs.
Orchestrated composite retrieval over 6 indices with advanced caching, FAISS/BM25 ranking, and a FastAPI + WebSocket backend delivering sub-100 ms responses.
Created 15+ agent tools (patch linkage, semantic search, onboarding insights, code-evolution tracking), scaling to 1M+ chunks and tripling triage throughput.

Weave – Synthetic Dataset Engine for LLMs

[GitHub]

Generated 300K+ synthetic samples with a transformer-compatible engine and configurable noising, boosting domain diversity by 30% and reducing hallucinations.
Built a merge pipeline for real + synthetic data, cutting preprocessing time by 40% and ensuring sampling consistency.