We make the world a safer place by applying behavioral and computer science in big data environments to identify illicit behaviors, actors, and networks. We build software to help our customers effectively and efficiently reduce crime, fraud, coercion, and violence.
Giant Oak seeks students to serve as a part-time Named Entity Recognition (NLP/ML) Intern.
You are an aspiring data scientist, machine learning engineer, or researcher in a related area. You are interested in expanding what’s possible with data, and you aren’t afraid to get your hands dirty with feature engineering or unwieldy model training. You are especially interested in the intersection of natural language processing (NLP) and machine learning (ML), and you want to use these tools to have a direct impact on the success of Giant Oak Search Technology (GOST) and to help make the world a better place.
Responsibilities:
Apply machine learning techniques in NLP for the classification and clustering of open-source and publicly available web text.
Research new techniques for Named Entity Recognition (NER) in current academic literature and evaluate their potential.
Identify opportunities where the addition of machine learning and statistical modeling will improve or automate existing processes in GOST.
Define evaluation criteria for existing and newly created models.
Requirements & Experience:
Experience with statistical modeling and feature engineering, or an educational background in machine learning, natural language processing, digital signal processing, or a related field.
Experience with experimental design and the logistics of data mining
Self-motivated and willing to handle competing priorities in a fast-paced environment
Must be eligible for a U.S. security clearance
Preferred Requirements & Experience:
Direct experience with Natural Language Processing and Named Entity Recognition
Experience with programming in Python 3, R, or MATLAB
Experience with scikit-learn, TensorFlow, Keras, or PyTorch
Familiarity with the mathematics of machine learning and data science at a graduate level