Principal Data Science-Catalog

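Coupang is one of the largest and fastest-growing e-commerce platforms in the world. Our mission is to create a world where customers wonder, "How did I ever live without Coupang?" To build that world, we are looking for talented people who are passionate about pursuing this goal with us. Backed by world-class technology and operations, we are working to transform the end-to-end customer experience, from introducing revolutionary last-mile delivery to exploring how to optimize product search and discovery on a mobile-first platform. Coupang has been named one of the "50 Smartest Companies" in the world by MIT Technology Review and was included in Forbes' "30 Global Game Changers".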

As a global company, Coupang has offices in Beijing, Los Angeles, Seattle, Seoul, Shanghai, and Silicon Valley.

 

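Job Overview

Our goal is to build the best e-commerce experience for our customers. We receive millions of products from sellers, and we want to build a consistent experience by automatically detecting features in the catalog and enriching the catalog with structured information. We use machine learning to develop models that extract missing data from text, detect inaccuracies, and fix them automatically. We strive to build efficient workflows that call for human judgment only when necessary.

Every day we solve problems across product categories ranging from cell phone cases to fashion, and we consume various sources of data, such as the catalog, reviews, and views, to continually enhance the catalog. And we do all of this at a scale that is growing rapidly.

As a data scientist, you will use your knowledge to build algorithms that help us understand text automatically (NLP / information extraction), as well as robust, scalable, and maintainable machine learning models. You will bring scientific rigor to problem-solving and provide key inputs to business and engineering teams on the overall strategy for the catalog. You will work with our top engineers to put your solutions into production systems that impact how our customers shop.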


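Responsibilities

  • Extract product data from unstructured data by designing and testing new algorithms and techniques, thereby improving product discovery.
  • Analyze large amounts of data to discover patterns, and build robust models to extract valuable information from various sources (e.g. product catalog, customer reviews, clicks) that vary in data quality and structure.
  • Automatically and accurately classify products into customer-facing categories.
  • Normalize variations (by language and spelling) in attributes such as brand, size, or color.
  • Identify identical or similar products among millions of incoming products.
  • Identify illegal products that are not allowed on the website.
  • Define the product information that is essential to the customer experience.
  • You will use these algorithms and techniques to improve what customers see on the website, resolving many cases automatically and identifying the cases that need input from catalog experts.
  • You will improve the customer experience by helping customers see high-quality product data and discover items that would otherwise not be visible, and by helping merchants improve their business.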

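Requirements:

  • Master's degree in Computer Science (machine learning, data mining, NLP, information retrieval), Statistics, or a related field
  • 2+ years of experience in machine learning, data mining, and big data
  • Good working knowledge of R/Python
  • Experience with distributed frameworks such as Spark/MapReduce/Hadoop
  • Excellent problem-solving skills and the ability to deliver creative solutions
  • Ability to decompose informal business problems into problem statements and build solutions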

 

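Preferred:

  • Ph.D. in Computer Science (machine learning, data mining, NLP, information retrieval), Statistics, or a related field
  • Extensive practical experience in machine learning, data mining, or statistics, with a track record of publications
  • Ability to mentor junior engineers and data analysts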

Visit our careers site to learn more:

https://rocketyourcareer.cn.coupang.com/

Coupang is one of the largest and fastest-growing e-commerce platforms on the planet. We are on a mission to revolutionize everyday lives for our customers, employees, and partners. We solve problems no one has solved before to create a world where people ask, “How did we ever live without Coupang?” Coupang is a global company with offices in Beijing, Los Angeles, Seattle, Seoul, Shanghai, and Silicon Valley.

 

Job Overview: 

As our Principal, Data Science for Catalog, you will be responsible for operational reporting and insights to make our consumer experience world-class.

Our goal is to build the best e-commerce experience for our customers. We receive millions of products from sellers, and we want to build a consistent experience by automatically detecting features in the catalog and enriching the catalog with structured information. We use machine learning to develop models that extract missing data from text, detect inaccuracies, and fix them automatically. We strive to build efficient workflows that allow humans to apply their judgment only when necessary.

On a daily basis, we solve problems across different kinds of product categories, ranging from cell phone cases to fashion, and we consume various sources of data, such as the catalog, reviews, and views, to continually enhance the catalog. And we do all of this at a scale that is growing at a rapid pace.

As a data scientist, you will use your knowledge to build algorithms that help us with automatic understanding of text (NLP / information extraction), as well as robust, scalable, and maintainable machine learning models. You will bring scientific rigour to problem-solving and provide key inputs to business and engineering teams on the overall strategy for the catalog. You will work with our top engineers to put your solutions into production systems that impact how our customers shop.


Responsibilities:

  • Extract product data from unstructured data by designing and testing new algorithms and techniques, thereby improving discovery of products.
  • Analyze large amounts of data to discover patterns and build robust models to extract valuable information from various sources (e.g. product catalog, customer reviews, clicks etc.) that vary in quality of data and structure.
  • Automatically classify products into customer-facing categories with high accuracy.
  • Normalize variations (by language and spelling) in attributes such as brand, size, or color.
  • Identify identical or similar products among millions of incoming products.
  • Identify illegal products that are not allowed on the website.
  • Define product information that is important for customer experience.
  • You will apply these algorithms and techniques to improve what customers see on the website, resolving many use cases automatically and identifying the cases that need input from catalog experts.
  • You will help improve the customer experience on the website by enabling customers to see high-quality product data and discover items that are not otherwise visible, and by helping merchants improve their business.

 

Requirements:

  • Master's degree in Computer Science (machine learning, data mining, NLP, information retrieval), Statistics, or a related field.
  • 2+ years of experience in machine learning, data mining, and big data.
  • Good working knowledge of R/Python.
  • Good working knowledge of R/Python
  • Experience with distributed frameworks like Spark/MapReduce/Hadoop.
  • Excellent problem-solving skills and the ability to deliver out-of-the-box solutions.
  • Ability to decompose informal business problems into problem statements and build solutions.

 

Preferred:

  • Ph.D. in Computer Science (machine learning, data mining, information retrieval), Statistics, or a related field.
  • Proven practical experience in machine learning, data mining, or statistics, with a track record of publications.
  • Desire to guide junior engineers and data scientists.
  • Strong verbal and written communication skills.

   
