Dmitry Beresnev

Featured

DoWell

2025

ML Developer, Tech Lead

An intelligent conversational system that uses Retrieval-Augmented Generation (RAG) to simulate expert consultations across various professional domains

Responsibilities

▸ Designed and implemented RAG architecture for domain-specific responses
▸ Deployed and connected generative models
▸ Engineered backend using FastAPI

Concepts

RAG, LLM, Indexing

Tech Stack

PyTorch, HuggingFace, BeautifulSoup, Docker, FastAPI

EBREG-RL: Example-Based Regular Expression Generation via Reinforcement Learning

2025

Developer

A reinforcement learning system for automatic regular expression generation from labeled examples. The project formulates regex generation as a Markov Decision Process using Reverse Polish Notation to handle operator precedence

✨ Achievements

▸ Successfully generated optimal regex patterns for number and word extraction tasks
▸ Implemented novel reward functions combining F1 score, accuracy metrics, and length penalties

Responsibilities

▸ Formulated regex generation as an MDP with a 104-action space of RPN tokens
▸ Designed custom reward functions balancing pattern accuracy and expression complexity
▸ Implemented and compared REINFORCE and A2C algorithms
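The RPN formulation and the reward design described above can be illustrated with a minimal sketch. It is not the project's actual code: the token names (`concat`, `|`, `*`, `+`, `?`), the full-match scoring, the failure reward of -1.0, and the `length_penalty` value are all assumptions chosen for illustration.

```python
import re

# Hypothetical token set: anything not listed here is treated as an operand
# (a literal or character class such as r"\d").
UNARY = {"*", "+", "?"}
BINARY = {"concat", "|"}


def rpn_to_regex(tokens):
    """Convert a postfix (RPN) token sequence to a regex string.

    Returns None when the sequence is malformed (missing operands or
    leftover items on the stack).
    """
    stack = []
    for tok in tokens:
        if tok in UNARY:
            if not stack:
                return None
            stack.append(f"(?:{stack.pop()}){tok}")
        elif tok in BINARY:
            if len(stack) < 2:
                return None
            right, left = stack.pop(), stack.pop()
            sep = "|" if tok == "|" else ""
            stack.append(f"(?:{left}{sep}{right})")
        else:
            stack.append(tok)
    return stack[0] if len(stack) == 1 else None


def reward(tokens, positives, negatives, length_penalty=0.01):
    """F1 of full-match classification minus a per-token length penalty."""
    pattern = rpn_to_regex(tokens)
    if pattern is None:
        return -1.0  # assumed penalty for malformed sequences
    try:
        rx = re.compile(pattern)
    except re.error:
        return -1.0  # assumed penalty for patterns that do not compile
    tp = sum(bool(rx.fullmatch(s)) for s in positives)
    fp = sum(bool(rx.fullmatch(s)) for s in negatives)
    fn = len(positives) - tp
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return f1 - length_penalty * len(tokens)
```

Because operators pop their operands from a stack, precedence never has to be resolved explicitly. For example, `rpn_to_regex([r"\d", "+"])` yields `(?:\d)+`.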

Concepts

Deep Reinforcement Learning, Policy Gradient Methods, NLP

Tech Stack

Python, PyTorch

PyFinder: fast search through Python documentation

2025

Developer

An Information Retrieval system providing fast search through Python's built-in documentation. The platform combines traditional inverted indexing with modern LLM-powered semantic search and RAG for natural language query processing, featuring content moderation and spell correction capabilities

✨ Achievements

▸ Performance with LLM embeddings + Ball Tree indexer: F1@1=0.53, nDCG@1=0.83

Responsibilities

▸ Implemented semantic search using sentence-transformers embeddings and Ball Tree spatial indexing
▸ Built RAG pipeline with prompt engineering, context retrieval, and source tracking
▸ Developed a Norvig-style spell corrector with a frequency-based language model
▸ Evaluated the system with both LLM-specific and ranking metrics
▸ Designed FastAPI backend and Next.js frontend with dual search/chat modes
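The spell-correction step can be sketched along the lines of Peter Norvig's well-known corrector: generate every string one edit away from the query word and pick the known candidate with the highest corpus frequency. This is a minimal illustration, not PyFinder's implementation; the toy `CORPUS` stands in for the real frequency model and is purely an assumption.

```python
import re
from collections import Counter

# Toy corpus standing in for the real word-frequency model (assumption).
CORPUS = "list dict tuple string print import list list dict range"
WORDS = Counter(re.findall(r"[a-z]+", CORPUS.lower()))
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    ]
    replaces = [left + c + right[1:] for left, right in splits if right for c in LETTERS]
    inserts = [left + c + right for left, right in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def known(words):
    """Keep only the words that occur in the frequency model."""
    return {w for w in words if w in WORDS}


def correct(word):
    """Return the most frequent known candidate, or the word itself."""
    candidates = known([word]) or known(edits1(word)) or [word]
    return max(candidates, key=lambda w: WORDS[w])
```

With this toy vocabulary, `correct("lst")` returns `list` and `correct("pritn")` returns `print`; unknown words with no close match are returned unchanged.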

Concepts

RAG, Semantic Search, NLP, LLM

Tech Stack

Python, PyTorch, Transformers, NLTK, FastAPI

Detecting AI-generated Python code via ML

2025

Developer

An ML system for detecting AI-generated Python code in programming competitions. The project compares two approaches: transformer-based models (CodeBERT, DeBERTa) for deep semantic analysis and AST-based lightweight models (Random Forest, Decision Trees, MLP) for efficient structural pattern recognition

✨ Achievements

▸ Achieved 95.9% accuracy with the CodeBERT model on a synthetic dataset
▸ Developed an efficient AST-based Random Forest achieving 83.5% accuracy with 2 ms inference time

Responsibilities

▸ Engineered dataset generation pipeline using 4 LLMs (Evil, Llama-3.2-3b, BLACKBOX.AI, DeepSeek) with specialized prompts
▸ Fine-tuned DeBERTa-v3 and CodeBERT models
▸ Implemented AST-based feature extraction using the Tree-sitter library for structural code analysis
▸ Integrated the LIME explainability framework for model interpretation
▸ Evaluated models across 6 metrics: F1 Score, ROC/AUC, Precision, Recall, Accuracy, and inference time
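The idea behind AST-based structural features can be shown with a short sketch. Note that it swaps in Python's standard-library `ast` module in place of Tree-sitter so that it runs without extra dependencies, and the feature names are hypothetical rather than the ones used in the project.

```python
import ast
from collections import Counter


def ast_features(source):
    """Return a few simple structural features for a Python source string."""
    tree = ast.parse(source)
    # Count how often each AST node type occurs in the whole tree.
    counts = Counter(type(node).__name__ for node in ast.walk(tree))

    def depth(node):
        # Depth of the subtree rooted at node (a leaf has depth 1).
        children = list(ast.iter_child_nodes(node))
        return 1 + max((depth(child) for child in children), default=0)

    return {
        "n_nodes": sum(counts.values()),
        "n_functions": counts["FunctionDef"],
        "n_loops": counts["For"] + counts["While"],
        "n_branches": counts["If"],
        "max_depth": depth(tree),
    }
```

A feature dictionary like this can be turned into a fixed-length vector and passed to a lightweight classifier such as a Random Forest, which is what keeps inference cheap compared with a transformer.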

Concepts

Trees, Transformers, AST, EdTech

Tech Stack

Python, PyTorch, Transformers, Tree-sitter, scikit-learn, LIME

RecSys via Approximate Matrix Factorization

2024

Developer

A recommender system built on approximate matrix factorization techniques for a synthetic dataset. The project explores multiple optimization approaches, such as gradient-based methods with various step-size strategies (Armijo, Wolfe conditions, Lipschitz estimation), advanced optimizers (Adam, RMSprop, AdaGrad, Heavy Ball, Nesterov), and vector-wise updates to solve the collaborative filtering problem

✨ Achievements

▸ Implemented 12+ optimization algorithms with 7 step-size selection strategies
▸ Compared full-matrix vs. row-wise/column-wise (Vector GD) update strategies

Responsibilities

▸ Formulated recommendation as a matrix factorization problem
▸ Implemented advanced optimizers: Adaptive GD, Heavy Ball, Nesterov momentum, AdaGrad, RMSprop, Adam, BFGS
▸ Experimented with Non-Negative Matrix Factorization using multiplicative updates
▸ Trained a neural network baseline with genre/demographic features
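As a rough illustration of the matrix factorization formulation, the sketch below fits user and item factors to observed ratings by minimizing the regularized squared error. It uses plain stochastic gradient descent with a fixed step size, which is a simplification and not one of the step-size strategies or optimizers listed above; the function names and hyperparameters are assumptions.

```python
import random


def factorize(ratings, n_users, n_items, rank=2, lr=0.02, reg=0.01,
              epochs=1000, seed=0):
    """Fit factors P (users) and Q (items) so that P[u] . Q[i] ~ rating.

    ratings is a list of (user_index, item_index, rating) triples.
    """
    rng = random.Random(seed)
    # Small positive initialization (an arbitrary choice for this sketch).
    P = [[rng.random() * 0.1 for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.random() * 0.1 for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(p * q for p, q in zip(P[u], Q[i]))
            for k in range(rank):
                p, q = P[u][k], Q[i][k]
                # Gradient step on the regularized squared error.
                P[u][k] += lr * (err * q - reg * p)
                Q[i][k] += lr * (err * p - reg * q)
    return P, Q


def rmse(ratings, P, Q):
    """Root mean squared error over the observed ratings."""
    total = 0.0
    for u, i, r in ratings:
        pred = sum(p * q for p, q in zip(P[u], Q[i]))
        total += (r - pred) ** 2
    return (total / len(ratings)) ** 0.5
```

Calling `factorize` with `epochs=0` returns the initial factors, which gives a convenient baseline for checking that training reduces `rmse` on the observed entries.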

Concepts

Matrix Factorization, Collaborative Filtering, RecSys

Tech Stack

Python, NumPy, scikit-learn, PyTorch

Featured

Accept School

2023 — Present

Founder, CEO; previously Lead Developer

A comprehensive EdTech platform that combines machine learning with modern web technologies to provide an interactive learning experience for programming students

✨ Achievements

▸ Currently used in educational organizations
▸ Approximately 200 active users

Responsibilities

▸ Led full-stack solution design
▸ Defined development and operational processes
▸ Developed code plagiarism detection system using ML
▸ Implemented generative AI for hint suggestions and for text and image generation using open-source LLMs
▸ Engineered backend with FastAPI and MongoDB
▸ Built frontend with Next.js

Concepts

EdTech, Generative AI, ML

Tech Stack

PyTorch, FastAPI, Next.js, MongoDB, Docker, Apache Kafka

Collaborations

Paradise Crane

Founder; previously Lead Developer

2023 — Present

Website

A collaborative team of developers revolutionizing educational technology through innovative EdTech solutions. The organization combines machine learning with modern web technologies to create accessible and functional learning experiences for students and educators. The main current project is the Accept educational platform.

Featured projects

Accept School

A comprehensive EdTech platform combining ML with modern web technologies for interactive programming education, featuring code plagiarism detection, generative AI for hints, and automated assessment systems

PyTorch, FastAPI, Next.js, MongoDB, Docker

Accept Documentation

Documentation

Detailed documentation of the Accept platform for educators and students, including usage examples for the AI features

Astro

Accept Marketing Landing

Marketing website showcasing the Accept platform's features, benefits, and educational impact for prospective users and educational institutions

Next.js, TypeScript

Crogs Foundation

Founder, Research Collaborator

2025 — Present

A community of enthusiastic researchers and developers dedicated to advancing the frontiers of technology through curiosity-driven research and practical applications. The foundation bridges cutting-edge research with user-centric implementations, focusing on AI-powered code evolution and intelligent automation systems.

Featured projects

DoWell