Lead QA Validation Engineer - Server & Storage Systems (NVMe, SATA, SSD, HDD) Onsite Job at Confidential Company, Richardson, TX

  • Confidential Company
  • Richardson, TX

Job Description

Lead Test Engineer - AI Servers & Storage (FW, NVMe, SATA, SSDs, HDDs, DIMMs)

Full-time position: must work onsite 4 days a week in Richardson, TX

CONFIDENTIAL: Publicly traded computer hardware infrastructure platform solutions company with over $3 billion in sales, whose stock price has grown over 200% in the last year because its products and services are used within AI data centers.

Must have great interpersonal skills and experience TESTING enterprise-level Storage & Server infrastructure systems for AI Data Centers

The Senior Lead Test (& Validation) Engineer - Storage & Server Infrastructure Systems will play a pivotal role in the design, development, and execution of comprehensive test strategies for AI data center storage and server infrastructure (HW + FW + SW).

This leadership position requires deep expertise in enterprise storage systems, server architectures, and networking, and a strong understanding of the unique performance and reliability demands of AI/ML workloads. The ideal candidate will be a hands-on technical leader.

Responsibilities:

  • Define, develop, and implement comprehensive test plans and strategies for all storage and server hardware, firmware, and software components within the AI Data Center environment.
  • Lead the Test team in designing, executing, and analyzing complex test cases, including functional, performance, reliability, stress, and endurance testing.
  • Design and implement automated test frameworks and scripts using languages like Python, Go, or similar, to improve efficiency and coverage of testing.
  • Conduct in-depth performance analysis and bottleneck identification for storage systems (e.g., NVMe, SSD, HDD arrays, distributed storage, SAN/NAS), server platforms (e.g., CPU, GPU, memory, PCIe, networking), and OpenBMC interfaces and features.
  • Debug issues related to BMC functionality and its interaction with server hardware.
  • Develop and maintain robust testbeds and infrastructure for continuous integration and validation.
  • Utilize open-source and commercial test tools relevant to storage, server, and OpenBMC validation.
  • Collaborate closely with hardware design, software development, infrastructure, and AI/ML engineering teams to understand requirements and integrate testing throughout the product lifecycle.
  • Communicate test progress, results, and critical issues effectively to stakeholders, including executive leadership.
  • Develop specialized test methodologies to validate performance and reliability under heavy AI/ML workloads (e.g., large model training, inference at scale, data ingestion).
  • Understand and test the interactions between GPU-accelerated computing, high-speed networking, and storage systems.

REQUIREMENTS

  • BS with 8+ years of hands-on hardware VALIDATION and platform TEST engineering experience, with direct exposure to AI Data Center Server & Storage components including NVMe, SATA, SSDs, HDDs, DIMMs, and system-level platforms used in large-scale cloud environments.
  • Must be firmly rooted in HARDWARE and FIRMWARE validation.
  • Must have 2+ years of experience in a LEAD or senior technical role, leading test initiatives and assigning work to and guiding junior test engineers.
  • Must be very hands-on with NVMe, SATA, SSDs, HDDs, and DIMMs.
  • Great interpersonal skills and English communication skills, with the ability to collaborate effectively across diverse teams and with vendors and customers.
  • Strong in debugging server hardware (BMC, PCIe, networking).
  • Strong in AI/ML workload optimization (TensorFlow, PyTorch) and their infrastructure requirements.
  • Strong Linux and Python/Go automation skills, and strong performance analysis of storage/server platforms.
  • Familiarity with OCP (Open Compute Project).
  • Certifications in relevant technologies (e.g., NetApp, Dell EMC, HPE, NVIDIA).
  • Distributed storage validation experience.
  • Ability to contribute to platform firmware validation testing and BIOS bring-up.
  • Must work onsite 4 days a week in Richardson, TX.

Job Tags

Full time
