Data Architecture Engineer


Doximity is rewiring healthcare and is the 6th fastest-growing technology company in North America. Here's how clinicians use our products. For us, transparency is key. Ensuring your goals and values align with ours is also an important step. Take a look at how we Work at Doximity.

You will join a small team of four data infrastructure engineers to build and maintain all aspects of our data pipelines, ETL processes, data warehousing, ingestion, and overall data infrastructure. We have one of the richest healthcare datasets in the world, and we're not afraid to invest in all things data to enhance our ability to extract insight.

What You'll Work On

  • Help establish robust solutions for consolidating data from a variety of data sources.
  • Collaborate with product managers and data scientists to architect pipelines to support delivery of recommendations and insights from machine learning models.
  • Build and maintain efficient data integration, matching, and ingestion pipelines.
  • Establish data architecture processes and practices that can be scheduled, automated, and replicated, and that serve as standards for other teams to leverage.
  • Build instrumentation, alerting, and error-recovery systems for the entire data infrastructure.
  • Spearhead, plan and carry out the implementation of solutions while self-managing.
  • We expect you to be very comfortable around Unix, Git, and AWS.

Must Have Skills & Requirements

  • At least three years of professional experience developing data infrastructure solutions.
  • Fluency in Python and SQL.
  • Experience building data pipelines with Spark and Kafka.
  • Passion for clean code and testing with Pytest, FactoryBoy, or equivalent. 
  • Comprehensive experience with Unix, Git, and AWS tooling.
  • Astute ability to self-manage, prioritize, and deliver functional solutions.

Nice to Have Skills & Requirements

  • Experience with MySQL replication, binary logs, and log shipping.
  • Experience with additional technologies such as Hive, EMR, Presto, or similar.
  • Experience with MPP databases such as Redshift and working with both normalized and denormalized data models.
  • Knowledge of data design principles and experience using ETL frameworks such as Sqoop or equivalent. 
  • Experience designing, implementing and scheduling data pipelines on workflow tools like Airflow, or equivalent.
  • Experience working with Docker, PyCharm, Neo4j, Elasticsearch, or equivalent. 

Our Data Stack

  • Python, Kafka, Spark, MySQL, Redshift, Presto, Airflow, Neo4j, Elasticsearch

Fun Facts About the Team

  • We have access to one of the richest healthcare datasets in the world, with deep information on hundreds of thousands of healthcare professionals and their connections.
  • Business decisions at Doximity are driven by our data, analyses, and insights.
  • Hundreds of thousands of healthcare professionals will utilize the products you build.
  • Our R&D team makes up about half the company, and the product is led by the R&D team. 
  • Our Data Science team consists of about 20 people.

Benefits & Perks

  • Comprehensive benefits including medical, vision, dental, life/AD&D, 401(k), flex spending accounts, and commuter benefits
  • Stock, pre-IPO stock incentives
  • 3+ weeks of PTO
  • 12 company holidays, including a company shutdown in December
  • Team trips to fun places like Lake Tahoe, Sonoma, Seattle and Park City
  • 5th-year sabbatical

About Doximity

Doximity is the leading social network for healthcare professionals, with over 75% of U.S. doctors as members. We have strong revenues, profits, and real market traction, and we're putting a dent in the inefficiencies of our $2.5 trillion U.S. healthcare system. After the iPhone, Doximity is the fastest-adopted product by doctors of all time. Our founder, Jeff Tangney, is the founder & former President and COO of Epocrates (IPO in 2010), and Nate Gross is the founder of digital health accelerator RockHealth. Our investors include top venture capital firms that have invested in Box, Salesforce, Skype, SpaceX, Tesla Motors, Twitter, Tumblr, Mulesoft, and Yammer. Our beautiful offices are located in SoMa, San Francisco.

We are an equal opportunity employer and value diversity at our company. We do not discriminate by race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.