Hi, I'm

Dmitry Beresnev

ML Engineer & Data Scientist

Available
MSc student specializing in AI and Data Science with deep research interest in Mathematical Optimization, LLMs and Deep Reinforcement Learning. Building state-of-the-art ML solutions.
AI Research, Data Science, ML Optimization, LLM, Deep Learning

About Me

I'm a Master's student in Computer Science at Innopolis University, specializing in AI and Data Science with a deep research focus on ML Optimization, LLMs and Deep Learning.

My expertise spans developing and implementing novel ML models and functional pipelines using PyTorch. I have hands-on experience with classical DRL algorithms including DQN, A2C, and REINFORCE.

I also have experience leading projects from conception to deployment, including the development of an AI-powered EdTech platform used by educational organizations.

Currently seeking challenging and interesting R&D positions where I can contribute to state-of-the-art ML and AI solutions.

Research Focus

Deep Learning, LLMs and Optimization

Technical Skills

PyTorch, PuLP, Scikit-learn, TRL and modern ML frameworks

Education

Master's student in AI & Data Science

Projects

Full-stack ML systems from research to production

Languages

Russian Native
English C1 (Advanced)

Education

MSc in Computer Science

Innopolis University

Innopolis, Russia

2024 — 2026

Field of study: AI & Data Science

Thesis

[in progress] New and Efficient Facet-Based Identification methods for Rank-Deficient Simplex-Structured Matrix Factorization

Supervisor: Valentin Leplat

Co-supervisors:

  • Nicolas Gillis

Relevant Coursework

High Dimensional Data Analysis, Advanced Statistics, Advanced Machine Learning

BSc in Computer Science

Innopolis University

Innopolis, Russia

4.95/5.0 GPA
2020 — 2024

Field of study: AI & Data Science

Thesis

Text plagiarism detection in the field of large language models using the reinforcement learning

Supervisor: Armen Beklaryan

Relevant Coursework

Optimization Methods in Machine Learning, Reinforcement Learning, Natural Language Processing, Practical Deep Learning

Research Experience

Huawei: Wireless Data Transmission

Researcher, ML Engineer

ISP RAS & Innopolis University

2024 — Present

Designing and simulating deep learning models for distributing wireless devices across base stations under time and resource constraints, for Huawei

Supervisor: Aleksandr Beznosikov

Responsibilities

  • Development and implementation of models in PyTorch
  • Creation and expansion of the training-testing pipeline
  • Conducting experiments

Concepts

Transformers, GNNs, Deep Learning

Tech Stack

PyTorch, NumPy

Diligent Learning: Prospects and Applications

Researcher, ML Engineer

MSU AI Center

2025 — Present

Implementing and testing Diligent Learning, a novel approach to fine-tuning LLMs for reasoning problems, based on the paper 'From Reasoning to Super-Intelligence: A Search-Theoretic Perspective' by Shai Shalev-Shwartz and Amnon Shashua

Supervisor: Petr Anokhin

Responsibilities

  • Development and implementation of the Diligent Learning pipeline
  • Fine-tuning LLMs in the new paradigm
  • Conducting experiments

Concepts

LLMs, Reasoning, Deep Learning

Tech Stack

Python, TRL, TensorBoard

New and Efficient Facet-Based Identification methods for Rank-Deficient Simplex-Structured Matrix Factorization

Researcher

Innopolis University

2025 — Present

Master's thesis research on new facet identification methods for SSMF, aimed at improving the existing GFPI algorithm

Supervisor: Valentin Leplat

Co-supervisors:

  • Nicolas Gillis

Responsibilities

  • Development and implementation of new polytope facet identification approaches
  • Reviewing SOTA SSMF methods and facet identification methods
  • Conducting experiments

Concepts

Constrained Optimization, Linear Programming

Tech Stack

PuLP, NumPy, CVXPY, scikit-learn

Applied AlphaEvolve: CAD Reconstruction and Combinatorial Geometry

Researcher

Skoltech Summer School of Machine Learning (SMILES-2025)

2025

Research project applying OpenEvolve (open-source AlphaEvolve) to CAD reconstruction from text descriptions and combinatorial geometry problems using LLM-driven evolutionary search

Supervisor: Petr Anokhin

Achievements

  • Achieved optimal ball partition results matching theoretical bounds in dimensions 2-13
  • Outperformed zero-shot LLM baselines across multiple complex 3D shapes
  • Established comprehensive benchmark pipeline for CAD reconstruction with 7 evaluation metrics

Responsibilities

  • Implemented OpenEvolve framework for CAD reconstruction task
  • Designed evaluation metrics including IoU, Chamfer Distance, and Hausdorff Distance
  • Analyzed evolutionary pathways for structural and parametric error correction
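
For readers unfamiliar with the metrics named above, here is a minimal sketch of IoU, Chamfer Distance, and Hausdorff Distance. It is an illustration only and not the project's evaluation code: it assumes shapes are already sampled into small point clouds or occupied-voxel sets, uses brute-force nearest-neighbour search, and all function names are hypothetical. Chamfer distance has several conventions in the literature; this version sums the two mean nearest-neighbour distances.

```python
import math
from typing import Sequence, Set, Tuple

Point = Sequence[float]
Voxel = Tuple[int, int, int]


def _nearest(p: Point, cloud: Sequence[Point]) -> float:
    """Euclidean distance from p to its nearest neighbour in cloud."""
    return min(math.dist(p, q) for q in cloud)


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Sum of the mean nearest-neighbour distances a->b and b->a."""
    a_to_b = sum(_nearest(p, b) for p in a) / len(a)
    b_to_a = sum(_nearest(q, a) for q in b) / len(b)
    return a_to_b + b_to_a


def hausdorff_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Largest nearest-neighbour distance in either direction."""
    return max(
        max(_nearest(p, b) for p in a),
        max(_nearest(q, a) for q in b),
    )


def voxel_iou(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Intersection over union of two occupied-voxel sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


if __name__ == "__main__":
    target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    generated = target + [(1.0, 1.0, 0.0)]  # one spurious extra point
    print(chamfer_distance(target, generated))
    print(hausdorff_distance(target, generated))
```

Chamfer averages over all points and so tolerates a few outliers, while Hausdorff reports the single worst mismatch, which is why the two are often reported together.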

Concepts

LLMs, Evolutionary Algorithms, CAD Reconstruction, Multi-Agent Systems

Tech Stack

Python, CadQuery, NumPy, Trimesh, SciPy

Text Plagiarism Detection in the Field of LLMs Using RL

Researcher

Innopolis University

2024

Bachelor's thesis research on a novel approach to text plagiarism detection using Deep Reinforcement Learning

Supervisor: Armen Beklaryan

Achievements

  • Best MSE of 0.108 on a synthetic dataset
  • Proposed three architectures based on DQN, A2C, and REINFORCE
  • Best results achieved by the REINFORCE model

Responsibilities

  • Designed novel DRL-based approach
  • Implemented and tested multiple architectures
  • Conducted and analyzed comprehensive experiments

Concepts

Reinforcement Learning, LLM, Deep Learning

Tech Stack

PyTorch, NumPy, Pandas

Work Experience

ML Developer

Innopolis CIPR

2025

Design and implementation of a RAG pipeline over proprietary Angular frontend repositories

Achievements

  • RAG pipeline quality validated on gold queries provided by experts

Responsibilities

  • Building indexers: inverted index, BallTree over model-generated embeddings and, partially, Faiss
  • Connecting local generative models
  • Designing a pipeline for scraping, embedding generation, indexing and retrieval

Concepts

RAG, LLM, Indexing

Tech Stack

PyTorch, HuggingFace, Docker, PostgreSQL, FastAPI

ML Engineer

Gazprom CPS

2024

Design and training of a predictive ML model to identify causes of defects in construction facilities

Achievements

  • Achieved 80% accuracy on a proprietary dataset

Responsibilities

  • Data preprocessing and feature engineering
  • Model building and validation
  • Full working pipeline development

Concepts

Tree-based models, MLP-based models, Ensemble methods, Transformers

Tech Stack

PyTorch, NumPy, scikit-learn, Pandas

ML Developer

Advanced Engineering School IU

2023

Development of a code generation model using a transformer-based architecture

Achievements

  • Significant contribution to research

Responsibilities

  • Fine-tuning Gorilla model on proprietary dataset

Concepts

LLMs, Transformers, LoRA

Tech Stack

PyTorch, NumPy, Pandas

Teaching Experience

Teaching Assistant

Innopolis University

2025

Teaching assistant for the Introduction to Optimization course for second-year bachelor's students

Responsibilities

  • Conducting tests and laboratory work

Concepts

Optimization, Linear Programming

Tech Stack

NumPy, CVXPY, PyTorch

Teaching Assistant

Yandex Student Camp on Math in AI

2024

Teaching assistant at the student camp for the Optimization Methods in Machine Learning course

Supervisor: Alexander Beznosikov

Responsibilities

  • Design and implementation of materials for seminars and homework assignments

Concepts

Optimization, Linear Programming

Tech Stack

PyTorch, NumPy, JAX, CVXPY

Projects

Featured

Accept School

2023 — Present

Founder, CEO; previously Lead Developer

A comprehensive EdTech platform that combines machine learning with modern web technologies to provide an interactive learning experience for programming students

✨ Achievements

  • Currently utilized in educational organizations
  • Approximately 200 active users

Responsibilities

  • Led full-stack solution design
  • Defined development and operational processes
  • Developed code plagiarism detection system using ML
  • Implemented generative AI for hint suggestions and for text and image generation using open-source LLMs
  • Engineered backend with FastAPI and MongoDB
  • Built frontend with Next.js

Concepts

EdTech, Generative AI, ML

Tech Stack

PyTorch, FastAPI, Next.js, MongoDB, Docker, Apache Kafka

Featured

DoWell

2025

ML Developer, Tech Leader

An intelligent conversational system that uses Retrieval-Augmented Generation (RAG) to simulate expert consultations across various professional domains

Responsibilities

  • Designed and implemented RAG architecture for domain-specific responses
  • Deployed and connected generative models
  • Engineered backend using FastAPI

Concepts

RAG, LLM, Indexing

Tech Stack

PyTorch, HuggingFace, BeautifulSoup, Docker, FastAPI

EBREG-RL: Example-Based Regular Expression Generation via Reinforcement Learning

2025

Developer

A reinforcement learning system for automatic regular expression generation from labeled examples. The project formulates regex generation as a Markov Decision Process using Reverse Polish Notation to handle operator precedence

✨ Achievements

  • Successfully generated optimal regex patterns for number and word extraction tasks
  • Implemented novel reward functions combining F1 score, accuracy metrics, and length penalties

Responsibilities

  • Formulated regex generation as MDP with 104-action space using RPN tokens
  • Designed custom reward functions balancing pattern accuracy and expression complexity
  • Implemented and compared REINFORCE and A2C algorithms
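
To make the RPN formulation concrete, the sketch below folds a sequence of RPN tokens into a regex string and scores it with an F1-minus-length reward. It is a hedged illustration, not the project's implementation: the tiny token vocabulary (`*`, `+`, `?`, `|`, `concat`, plus arbitrary operand tokens), the `length_weight` value, and the `-1.0` penalty for malformed sequences are all assumptions, and the real system uses a 104-action space.

```python
import re
from typing import List, Optional, Sequence, Tuple

UNARY = {"*", "+", "?"}      # postfix quantifiers: pop one operand
BINARY = {"concat", "|"}     # pop two operands


def rpn_to_regex(tokens: Sequence[str]) -> Optional[str]:
    """Fold RPN tokens into a regex; return None if the sequence is malformed."""
    stack: List[str] = []
    for tok in tokens:
        if tok in UNARY:
            if not stack:
                return None
            # Non-capturing group keeps the quantifier bound to the whole operand.
            stack.append(f"(?:{stack.pop()}){tok}")
        elif tok in BINARY:
            if len(stack) < 2:
                return None
            right, left = stack.pop(), stack.pop()
            stack.append(f"(?:{left}|{right})" if tok == "|" else left + right)
        else:
            stack.append(tok)  # operand: literal or character class
    return stack[0] if len(stack) == 1 else None


def reward(
    tokens: Sequence[str],
    examples: Sequence[Tuple[str, bool]],
    length_weight: float = 0.01,  # assumed penalty per token
) -> float:
    """F1 over labeled examples minus a length penalty; -1.0 for invalid output."""
    pattern = rpn_to_regex(tokens)
    if pattern is None:
        return -1.0
    try:
        compiled = re.compile(pattern)
    except re.error:
        return -1.0
    tp = fp = fn = 0
    for text, label in examples:
        pred = compiled.fullmatch(text) is not None
        tp += int(pred and label)
        fp += int(pred and not label)
        fn += int(label and not pred)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return f1 - length_weight * len(tokens)


if __name__ == "__main__":
    examples = [("123", True), ("7", True), ("abc", False), ("", False)]
    print(rpn_to_regex([r"\d", "+"]), reward([r"\d", "+"], examples))
```

Because every operator consumes operands already on the stack, the agent never has to emit parentheses, which is how RPN sidesteps operator precedence.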

Concepts

Deep Reinforcement Learning, Policy Gradient Methods, NLP

Tech Stack

Python, PyTorch

PyFinder: fast search through Python documentation

2025

Developer

An Information Retrieval system providing fast search through Python's built-in documentation. The platform combines traditional inverted indexing with modern LLM-powered semantic search and RAG for natural language query processing, featuring content moderation and spell correction capabilities

✨ Achievements

  • Performance with LLM embeddings + Ball Tree indexer: F1@1=0.53, nDCG@1=0.83

Responsibilities

  • Implemented semantic search using sentence-transformers embeddings and Ball Tree spatial indexing
  • Built RAG pipeline with prompt engineering, context retrieval, and source tracking
  • Developed Norvig spell corrector with frequency-based language model
  • Evaluated using comprehensive metrics: LLM-specific and ranking metrics
  • Designed FastAPI backend and Next.js frontend with dual search/chat modes
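
As an illustration of the spell-correction component mentioned above, here is a minimal Norvig-style corrector: a word-frequency table acts as the language model, and candidates are known words within one or two edits of the query. This is a generic sketch of the well-known technique, not PyFinder's code; the toy corpus, the lowercase-ASCII tokenization, and the function names are assumptions.

```python
import re
from collections import Counter
from typing import Set

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def train(corpus: str) -> Counter:
    """Word-frequency table used as a unigram language model."""
    return Counter(re.findall(r"[a-z]+", corpus.lower()))


def edits1(word: str) -> Set[str]:
    """All strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    ]
    replaces = [left + c + right[1:] for left, right in splits if right for c in ALPHABET]
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def correct(word: str, counts: Counter) -> str:
    """Most frequent known word within two edits, else the word unchanged."""
    word = word.lower()
    if word in counts:
        return word
    candidates = {w for w in edits1(word) if w in counts}
    if not candidates:
        candidates = {
            w2 for w1 in edits1(word) for w2 in edits1(w1) if w2 in counts
        }
    return max(candidates, key=counts.__getitem__) if candidates else word


if __name__ == "__main__":
    counts = train("import list list dict tuple string string string")
    print(correct("lisst", counts), correct("strng", counts))
```

In a documentation search setting the frequency table would be built from the indexed documents themselves, so corrections are biased toward terms that can actually be retrieved.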

Concepts

RAG, Semantic Search, NLP, LLM

Tech Stack

Python, PyTorch, Transformers, NLTK, FastAPI

Detecting AI-generated Python code via ML

2025

Developer

An ML system for detecting AI-generated Python code in programming competitions. The project compares two approaches: transformer-based models (CodeBERT, DeBERTa) for deep semantic analysis and AST-based lightweight models (Random Forest, Decision Trees, MLP) for efficient structural pattern recognition

✨ Achievements

  • Achieved 95.9% accuracy with CodeBERT model on synthetic dataset
  • Developed efficient AST-based Random Forest achieving 83.5% accuracy with 2ms inference time

Responsibilities

  • Engineered dataset generation pipeline using 4 LLMs (Evil, Llama-3.2-3b, BLACKBOX.AI, DeepSeek) with specialized prompts
  • Fine-tuned DeBERTa-v3 and CodeBERT models
  • Implemented AST-based feature extraction using Tree-sitter library for structural code analysis
  • Integrated LIME explainability framework for model interpretation
  • Evaluated models across 6 metrics: F1 Score, ROC/AUC, Precision, Recall, Accuracy, and inference time
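
The idea behind the AST-based lightweight models can be sketched as turning a program's syntax tree into a small numeric feature vector. The example below is an assumption-laden stand-in: it uses Python's built-in `ast` module rather than the Tree-sitter library the project relied on, and the feature names are hypothetical rather than the project's actual feature set.

```python
import ast
from collections import Counter
from typing import Dict


def ast_features(source: str) -> Dict[str, float]:
    """Structural features of a Python snippet, suitable for a tree-based classifier."""
    tree = ast.parse(source)
    counts = Counter(type(node).__name__ for node in ast.walk(tree))
    total = sum(counts.values())

    def depth(node: ast.AST, level: int = 1) -> int:
        """Length of the longest root-to-leaf path below node."""
        children = list(ast.iter_child_nodes(node))
        if not children:
            return level
        return max(depth(child, level + 1) for child in children)

    comprehensions = (
        counts["ListComp"] + counts["DictComp"]
        + counts["SetComp"] + counts["GeneratorExp"]
    )
    return {
        "n_nodes": float(total),
        "max_depth": float(depth(tree)),
        "n_functions": float(counts["FunctionDef"]),
        "n_loops": float(counts["For"] + counts["While"]),
        "n_branches": float(counts["If"]),
        "n_comprehensions": float(comprehensions),
        "name_ratio": counts["Name"] / total,  # share of identifier nodes
    }


if __name__ == "__main__":
    sample = (
        "def positives(xs):\n"
        "    out = []\n"
        "    for x in xs:\n"
        "        if x > 0:\n"
        "            out.append(x)\n"
        "    return out\n"
    )
    for name, value in ast_features(sample).items():
        print(f"{name}: {value}")
```

Features like these are cheap to compute, which is consistent with the millisecond-scale inference time reported for the Random Forest variant.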

Concepts

Trees, Transformers, AST, EdTech

Tech Stack

Python, PyTorch, Transformers, Tree-sitter, scikit-learn, LIME

RecSys via Approximate Matrix Factorization

2024

Developer

A recommender system built on approximate matrix factorization techniques for a synthetic dataset. The project explores multiple optimization approaches to the collaborative filtering problem: gradient-based methods with various step-size strategies (Armijo, Wolfe conditions, Lipschitz estimation), advanced optimizers (Adam, RMSprop, AdaGrad, Heavy Ball, Nesterov), and vector-wise updates

✨ Achievements

  • Implemented 12+ optimization algorithms with 7 step-size selection strategies
  • Compared full-matrix vs. row-wise/column-wise (Vector GD) update strategies

Responsibilities

  • Formulated recommendation as matrix factorization problem
  • Implemented advanced optimizers: Adaptive GD, Heavy Ball, Nesterov momentum, AdaGrad, RMSprop, Adam, and BFGS
  • Experimented with Non-Negative Matrix Factorization using multiplicative updates
  • Trained a neural network baseline with genre/demographic features
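
One of the step-size strategies named in this project, Armijo backtracking, can be sketched on the matrix factorization objective as follows. This is a minimal pure-Python illustration under stated assumptions, not the project's code: the loss is the squared error over observed ratings only, there is no regularization, and the constants `c` and `beta`, the initialization range, and all function names are hypothetical choices.

```python
import random
from typing import Dict, List, Tuple

Ratings = Dict[Tuple[int, int], float]  # observed (user, item) -> rating
Matrix = List[List[float]]


def dot(u: List[float], v: List[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def loss(U: Matrix, V: Matrix, ratings: Ratings) -> float:
    """0.5 * squared error over the observed entries of R ~ U V^T."""
    return 0.5 * sum((r - dot(U[i], V[j])) ** 2 for (i, j), r in ratings.items())


def gradients(U: Matrix, V: Matrix, ratings: Ratings) -> Tuple[Matrix, Matrix]:
    gU = [[0.0] * len(U[0]) for _ in U]
    gV = [[0.0] * len(V[0]) for _ in V]
    for (i, j), r in ratings.items():
        err = r - dot(U[i], V[j])
        for k in range(len(U[0])):
            gU[i][k] -= err * V[j][k]
            gV[j][k] -= err * U[i][k]
    return gU, gV


def step(M: Matrix, G: Matrix, t: float) -> Matrix:
    return [[m - t * g for m, g in zip(m_row, g_row)] for m_row, g_row in zip(M, G)]


def armijo_gd(ratings: Ratings, n_users: int, n_items: int, rank: int = 2,
              iters: int = 200, c: float = 1e-4, beta: float = 0.5, seed: int = 0):
    """Gradient descent where each step size is found by Armijo backtracking."""
    rng = random.Random(seed)
    U = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_items)]
    history = [loss(U, V, ratings)]
    for _ in range(iters):
        gU, gV = gradients(U, V, ratings)
        g_norm_sq = sum(g * g for row in gU + gV for g in row)
        t = 1.0
        while t > 1e-12:
            U_new, V_new = step(U, gU, t), step(V, gV, t)
            new_loss = loss(U_new, V_new, ratings)
            # Armijo condition: require a sufficient decrease proportional to t.
            if new_loss <= history[-1] - c * t * g_norm_sq:
                break
            t *= beta  # otherwise shrink the step and retry
        else:
            break  # no acceptable step size found; stop early
        U, V = U_new, V_new
        history.append(new_loss)
    return U, V, history


if __name__ == "__main__":
    ratings = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.0, (1, 1): 4.0, (2, 0): 3.0}
    _, _, history = armijo_gd(ratings, n_users=3, n_items=2)
    print(history[0], history[-1])
```

Backtracking avoids hand-tuning a learning rate: each accepted step is guaranteed not to increase the loss, at the cost of extra loss evaluations per iteration.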

Concepts

Matrix Factorization, Collaborative Filtering, RecSys

Tech Stack

Python, NumPy, scikit-learn, PyTorch

Collaborations

Paradise Crane

Founder; previously Lead Developer

2023 — Present

A collaborative team of developers revolutionizing educational technology through innovative EdTech solutions. The organization combines machine learning with modern web technologies to create accessible and functional learning experiences for students and educators. The current main project is the Accept educational platform.

Featured projects

Accept School

A comprehensive EdTech platform combining ML with modern web technologies for interactive programming education, featuring code plagiarism detection, generative AI for hints, and automated assessment systems

PyTorch, FastAPI, Next.js, MongoDB, Docker

Accept Documentation

Rich documentation of the Accept platform for educators and students, including usage examples for its AI features

Astro

Accept Marketing Landing

Marketing website showcasing the Accept platform's features, benefits, and educational impact for prospective users and educational institutions

Next.js, TypeScript

Crogs Foundation

Founder, Research Collaborator

2025 — Present

A community of enthusiastic researchers and developers dedicated to advancing the frontiers of technology through curiosity-driven research and practical applications. The foundation bridges cutting-edge research with user-centric implementations, focusing on AI-powered code evolution and intelligent automation systems.

Featured projects

DoWell

An intelligent conversational system that uses Retrieval-Augmented Generation (RAG) to simulate expert consultations across various professional domains

Python, HuggingFace, RAG, LLM

Applied AlphaEvolve (SMILES'25)

Research project applying OpenEvolve (open-source AlphaEvolve) to CAD reconstruction from text descriptions and combinatorial geometry problems using LLM-driven evolutionary search

Python, LLMs, Evolutionary Algorithms, CadQuery, SciPy

Crogs Bot

Multi-purpose Telegram bot providing intelligent automation and interactive features for news scraping, translation, and user entertainment through natural language processing

Python, Telegram Bot API, NLP

Get In Touch

I'm currently seeking interesting R&D opportunities in Machine Learning and AI. Feel free to reach out!

GitHub

@dsomni

Telegram

@dsomni