We’re building AI-driven SaaS platforms that power intelligent automation for SMBs using LLMs, AI agents, and data-intensive pipelines.
You’ll work in startup mode, helping scale AI systems from early R&D to secure, fault-tolerant production infrastructure.
YOUR ROLE:
You will own the cloud infrastructure, deployment pipelines, and reliability of AI-powered systems.
Your mission is to ensure scalability, security, and cost-efficient operation of AI workloads in production.
You’ll work closely with backend and AI engineers to turn experimental AI features into robust, observable, and resilient systems.
YOUR STACK:
- Design and build Python backend services for AI-driven products.
- Build async, scalable APIs for AI inference and orchestration.
- Move AI prototypes into production-grade systems.
- Ensure observability, performance, and cost efficiency of AI workloads.
- Design data pipelines for embeddings, vector search, and context retrieval.
- Implement LLM-powered features (agents, tools, workflows, RAG).
Cloud & Infrastructure
- AWS (VPC, EC2, ECS/EKS, RDS, S3, IAM, CloudWatch)
- Auto-scaling, load balancing, multi-AZ setups
- Terraform (IaC, modules, environments)
CI/CD & Runtime
- Docker
- CI/CD pipelines (GitHub Actions / GitLab CI / similar)
- Blue/green & rolling deployments
Security
- IAM, least-privilege access
- Secrets management
- Network security (VPC, security groups, private subnets)
- Compliance-aware setups (SOC 2 / basic security best practices)
Reliability & Observability
- Monitoring, logging, tracing
- SLOs, alerting, failure detection
- Backup & disaster recovery strategies
AI & Data Considerations
- Supporting LLM inference workloads (latency, throughput, cost)
- GPU / high-CPU workloads (nice to have)
- Caching and batching strategies for AI APIs
- Safe rollout of AI features with controlled blast radius
NICE-TO-HAVE:
- Experience with high-load or data-intensive SaaS systems
- Knowledge of AI infrastructure patterns (model serving, vector DB hosting)
- Security audits or compliance exposure
- Cost optimization for AI & data pipelines
- Kubernetes (EKS) production experience
RESPONSIBILITIES:
- Design and maintain AWS infrastructure for AI-driven SaaS platforms
- Build secure, scalable, fault-tolerant environments
- Ensure high availability, disaster recovery, and smooth scaling
- Optimize cloud cost for AI-heavy workloads
- Own observability, alerting, and incident response
- Design deployment strategies for AI services and inference workloads
- Implement infrastructure as code using Terraform
OFFER:
- Open management without bureaucracy;
- Salary reviews based on performance appraisal results;
- 10 days of paid sick leave and 18 working days of vacation;
- Days off on national and bank holidays in accordance with Ukrainian legislation;
- Real AI in production — not just demos;
- Strong R&D culture with ownership;
- Opportunity to shape AI backend architecture from day one;
- Flexible remote work;
- Direct impact & fast feedback loops.