About the role
The way software gets built has changed. At PingAura, AI agents write a real share of our code, and humans make the calls that matter — what to ship, what to merge, what to retire. We are hiring an Applied AI Engineer to live at that boundary: dispatch agents, review pull requests, design evaluations, and own the long tail of production engineering that keeps the product reliable.
PingAura helps brands get discovered in AI search. Our flagship product is the AI Coworker — an agent that does the work of an AEO team across OpenAI, Gemini, and Anthropic. We are pre-seed and backed by 14 CXO angel investors, Google for Startups, and AWS. The product is live, customers are paying, and decisions move fast.
This is not a "junior who learns to code" role. It is for someone who already builds with Cursor, Claude Code, or Codex every day and is ready to operate at a higher level: orchestrating agents across worktrees, gating merges on evals, and owning the parts of production that touch every other team.
This is an on-site role at our Mumbai office. You will sit next to the founding team. You will hear the customer calls. You will push code on day one.
Responsibilities
- Dispatch and supervise multiple agents in parallel — Cursor, Claude Code, Codex — across worktrees on real product work
- Review agent-generated pull requests with sharp judgment. Catch subtle bugs, missing edge cases, and weak architectural decisions; push back fast when work is not ready
- Design evaluations and acceptance criteria so each agent run is faster, sharper, and more autonomous. Turn customer escalations into permanent regression tests
- Own the long tail of production engineering: cron jobs, internationalization, dependency upgrades, monitoring, schema housekeeping, and security patches
- Ship product features end to end when the work is judgment-heavy and should not be delegated
- Improve the internal tooling that makes the agent fleet faster: dispatch scripts, eval harnesses, prompt libraries, agent observability
- Join customer calls. Translate what you hear into evals, tests, and shipped code
You may be a good fit if
- You have shipped real projects with Cursor, Claude Code, Codex, or comparable agentic tooling — not toy demos, real things people use
- You can read a pull request and explain what is wrong with it in under five minutes
- You are more excited about orchestrating four agents in parallel than writing four functions by hand
- You treat evaluation as a first-class discipline and understand why, for LLM-driven features, eval pass rate matters more than test coverage
- You write clean TypeScript or Python and care about readability, testability, and small interfaces
- You have built at least one production feature where the core logic was a large language model call, and you know why "it works in the demo" is not the same as "it works in production"
- You have early-career experience (0–2 years of work) or are a final-year computer science student — we measure work shipped, not years served
- You live in Mumbai or can move here. This is an on-site role
Strong candidates may also have
- Production experience integrating large language model APIs with tool calling, structured outputs, streaming, and long-horizon workflows
- Familiarity with eval frameworks such as Langfuse, Braintrust, RAGAS, DeepEval, or OpenAI Evals
- Working knowledge of PostgreSQL, Redis, and Next.js Server Actions
- Exposure to GCP or AWS in production
- Open-source contributions to AI tooling, agent frameworks, or developer infrastructure
What we work with
- Language: TypeScript across the stack; Python for ML and eval tooling
- Web: Next.js 16 (App Router), React 19, Server Actions, Tailwind, shadcn/ui
- Database: PostgreSQL on Supabase with row-level security; pg_cron and pgmq for scheduled and queued work
- Cache and rate limiting: Redis on Memorystore — caching, distributed rate limiting, queue patterns
- AI: OpenAI, Gemini, and Anthropic via a multi-provider routing layer
- Observability: Langfuse for LLM traces, Sentry for errors, plus standard cloud logging and monitoring
- Cloud: GCP for compute, data, and storage; AWS for CDN
- Workflow: Turborepo monorepo, pnpm; Cursor and Claude Code in daily use
Compensation
- Competitive salary benchmarked for early-stage startups in India
Why this team
You will operate at the new frontier of how software gets built. Most engineers are still learning to use AI tools well; you will help define what comes after that. The team is small, the customers are real, the founders are accessible, and the decisions are fast.