Instabase was founded in 2015. Its mission is to advance the state of the art by building tools that help people solve important problems, make discoveries, and create new breakthroughs.
Instabase is an operating system designed for operational efficiency. It uses the web browser as the user interface, a pluggable storage system for managing data (files and databases), and an app store for applications. These applications run on the Instabase Platform, which provides core capabilities for managing diverse, distributed datasets; support for collaboration and access control; and a runtime for Instabase Applications.
The applications include state-of-the-art tools for processing a diverse set of unstructured data (scanned images, PDFs, Word/Excel documents, emails, websites, etc.), running extensible functions with a serverless framework, machine learning (natural language processing, image processing, classification, clustering, etc.), and data science.
Our customers include large enterprises with huge operational costs in a variety of domains, such as financial services (e.g. banks, insurance), healthcare, and logistics/supply chain.
As a non-profit initiative, Instabase provides a hosted IPython-style notebook for education, which is widely used by universities (Stanford, MIT, Columbia, University of Chicago, etc.) for teaching classes. Jennifer Widom, Stanford's Dean of Engineering, used Instabase as the platform for her 2016-17 Instructional Odyssey, a year-long sabbatical in which she traveled the world offering free short courses in data and design.
Instabase takes documents of any form -- image, PDF, web, text -- and extracts information and insights that power the world's businesses. To make this possible, we need to build the world's best Optical Character Recognition (OCR) platform. We build and maintain several in-house OCR pipelines to target different types of real-world data, including clean text documents, noisy photographs, and handwriting.
As a Senior Deep Learning Engineer focusing on OCR and Image Processing, you will be responsible for maintaining and improving these OCR pipelines. The role spans teams: you will work with the customer team to prioritize improvements based on real-world feedback, with the infrastructure team to achieve Fortune 50-level scale, and with our front-end team to craft end-user apps that best show off your work.
- Maintain and improve our text and handwriting OCR models
- Identify benchmark datasets relevant to different industries and use cases, and work towards production-quality performance on them
- Devise ways to extract information from images beyond just text: document structure and flow, font and styling information, logos and tables
- Fluent and creative with Deep Learning
- Experience with image processing (OpenCV, heuristic methods)
- Excellent with Python
- Meticulous at designing, scoping, and conducting data-driven experiments
- Enjoys challenging but rewarding algorithmic problems
Instabase is an equal opportunity employer and values diversity in all forms. Instabase does not discriminate on the basis of race, religion, color, national origin, gender identity, sexual orientation, age, marital status, protected veteran status, disability, or any other unlawful factor. Instabase also complies with local laws, including the San Francisco Fair Chance Ordinance.