Microsoft
Cambridge Internship in ML Model Optimization
Job Description
Responsibilities
- Research and develop quantization flows for LLM inference and training
- Design, implement, and evaluate the performance of quantized state-of-the-art (SOTA) LLMs
- Write and present your findings in technical documents or presentations
Qualifications
Required/Minimum Qualifications:
- Be enrolled in a Master's/PhD program in Computer Science, Machine Learning, or a related discipline
- Substantial experience with LLM quantization and model compression
- Substantial knowledge of low-precision data types, such as floating-point formats, integer formats, and block floats
Other Requirements:
- Cloud Background Check
Preferred/Additional Qualifications:
- PyTorch, Python, and hands-on experience in software tool development
- Outstanding communication skills
Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.