C++ Engineer, AI Runtime
Posted 7 days ago
**About Us**:
We are a **stealth-mode startup** building next-generation infrastructure for the AI industry. Our team has decades of experience in software, systems, and deep tech. We are working on a new kind of AI runtime that pushes the boundaries of performance and flexibility, making advanced models portable, efficient, and customizable for real-world deployment.
If you want to be part of a small, fast-moving team shaping the **future of applied AI systems**, this is your opportunity.
**Role**:
We are looking for a **C++ Engineer** with a strong systems and GPU programming background to help extend and optimize an open-source AI inference runtime. You will work on the low-level internals of large language model serving, focusing on:
- Dynamic adapter integration (e.g., LoRA/QLoRA)
- Incremental model update mechanisms
- Multi-session inference caching and scheduling
- GPU performance improvements (Tensor Cores, CUDA/ROCm)
This is a **hands-on role**: you will be designing, coding, profiling, and iterating on high-performance inference code that runs directly on CPUs and GPUs.
**Responsibilities**:
- Implement support for **runtime adapter loading (LoRA)**, enabling models to be customized on the fly without retraining or model merges.
- Design and implement mechanisms for **incremental model deltas**, allowing models to be extended and updated efficiently.
- Extend the runtime to handle **multi-session execution**, with isolation and caching strategies for concurrent users.
- Optimize core math kernels and memory layouts to improve inference performance on **CPU and GPU backends**.
- Collaborate with backend and infrastructure engineers to integrate your work into APIs and orchestration layers.
- Write benchmarks, unit tests, and profiling tools to ensure correctness and measure performance gains.
- Contribute to system architecture discussions and help define the roadmap for future runtime features.
**Requirements**:
- Strong proficiency in **modern C++ (C++14/17/20)** and systems programming.
- Solid understanding of **low-level performance optimization**: memory management, multithreading, SIMD, cache efficiency.
- Experience with **CUDA** and/or **ROCm/HIP** GPU programming.
- Familiarity with **linear algebra kernels** (matrix multiply, attention) and how they map to hardware acceleration (Tensor Cores, BLAS libraries, etc.).
- Exposure to **machine learning inference frameworks** (e.g., llama.cpp, TensorRT, ONNX Runtime, TVM, PyTorch internals) is a plus.
- Comfortable working in a **Unix/Linux** environment; experience with build systems (CMake, Bazel) and CI pipelines.
- Strong problem-solving and debugging skills; ability to dive deep into both code and performance traces.
- Self-motivated and able to thrive in a **fast-moving startup** environment.
**Nice to Have**:
- Experience implementing **LoRA or adapter-based fine-tuning** in inference runtimes.
- Knowledge of **quantization methods** and deploying quantized models efficiently.
- Background in distributed systems or multi-GPU orchestration.
- Contributions to **open-source ML/AI systems**.
**Why Join**:
- Build core IP at the intersection of **AI and systems engineering**.
- Work with a highly technical founding team on problems that are both intellectually challenging and commercially impactful.
- Opportunity to shape the direction of a new AI platform from the ground up.
- Competitive compensation (contract or full-time), equity potential, and flexible remote work.