What you'll be doing:
Design and execute performance benchmarks using industry-standard tools such as MLPerf, UCX, our Collective Communications Library (NCCL), and CloudAI, along with customer-representative AI workloads, on our state-of-the-art GPU clusters.
Translate your benchmark data and technical insights into compelling, high-impact marketing assets and performance-driven sales enablement materials.
Collaborate closely with the Product Management, ASIC and Software Architecture, and Sales teams; provide feedback on product features; and ensure our performance results are technically accurate and impactful.
What we need to see:
B.Sc. in Computer Science or Software Engineering, or equivalent experience.
5+ years of experience benchmarking and analyzing high‑performance networking solutions, including RDMA, MPI, and large‑scale collective communication frameworks.
Hands‑on expertise in testing and benchmarking deep learning workloads on our GPUs with CUDA, TensorFlow, and PyTorch, focused on validating and demonstrating distributed training and inference performance over NCCL, RoCE, and RDMA.
Demonstrated proficiency in performance analysis methodologies and techniques.
Understanding of Ethernet and high-performance networking.
Programming experience in Python, Bash, and C.
Experience with distributed job orchestration (Slurm, Kubernetes).
Experience with Linux distributions.
Fast, self-directed learner with strong analytical and problem-solving skills.
In-depth knowledge and experience with AI workloads and benchmarking for large-scale distributed training/inference systems.
Ways to stand out from the crowd:
Strong performance analysis skills and methodologies using modern tools.
Deep knowledge of AI/data center Ethernet network protocols and best practices (Clos fabrics, BGP, VXLAN, etc.).
Hands-on experience with automation, CI/CD pipelines and DevOps practices.
Expertise in AI fabric telemetry, including metrics capture and analysis, and in telemetry tools such as Prometheus and Grafana.
In-depth system knowledge (Intel / AMD / ARM CPUs, NVIDIA GPUs, NICs, memory, PCI).