Bachelor’s degree in analytics, engineering, math, computer science, information technology, or related discipline
8+ years of professional experience in the big data space
8+ years' experience in engineering data pipelines using big data technologies (Spark, Flink, etc.) on large-scale data sets
Expert knowledge of writing complex PySpark, SQL, and dbt code and of ETL development, with experience processing extremely large datasets
Expert in applying SCD (slowly changing dimension) types on an S3 data lake using Databricks/Delta Lake
Experience with data modeling principles and data cataloging
Experience with a job scheduler such as Airflow or similar
Demonstrated ability to analyze large data sets to identify gaps and inconsistencies, provide data insights, and advance effective product solutions
Deep familiarity with AWS services (S3, EventBridge, Kinesis, Glue, EMR, Lambda)
Experience with data warehouse platforms such as Redshift, Databricks, BigQuery, or Snowflake
Ability to quickly learn complex domains and new technologies
Innately curious and organized, with the drive to analyze data to identify deliverables, anomalies, and gaps and propose solutions to address these findings
Thrive in a fast-paced startup environment
Desirables
Experience with customer data platform tools such as Segment
Experience with data streaming such as Kafka
Experience using Jira, GitHub, Docker, Codefresh, and Terraform
Experience contributing to full lifecycle deployments with a focus on testing and quality
Experience with data quality processes, data quality checks, validations, and the definition and measurement of data quality metrics
AWS/Kafka/Databricks or similar certifications
What the job involves
Collaborate with product managers, data scientists, data analysts, and engineers to define requirements and data specifications
Plan, design, build, test, and deploy data warehouse and data mart solutions
Lead small to medium-sized projects; solve data problems through documentation, design, and creation of ETL jobs and data marts
Increase the usage and value of the data warehouse and ensure the integrity of the data delivered
Develop and implement standards and promote their use throughout the warehouse
Develop, deploy, and maintain data processing pipelines using cloud technology such as AWS, Kubernetes, Airflow, Redshift, Databricks, and EMR
Define and manage overall schedule and availability for a variety of data sets
Work closely with other engineers to enhance infrastructure and improve reliability and efficiency
Make smart engineering and product decisions based on data analysis and collaboration
Act as an in-house data expert and make recommendations regarding standards for code quality and timeliness
Architect cloud-based data pipeline solutions to meet stakeholder needs
We believe everyone deserves affordable and convenient healthcare. Our mission is to build better ways for people to find the right care at the best price. Our technology gives all Americans, regardless of income or insurance status, the knowledge, choice, and care they need to stay healthy.
GoodRx supports all Americans with their healthcare challenges. Since 2011, we’ve helped Americans save over $35 billion on prescriptions via savings cards, our website, and the GoodRx app, which is one of the most downloaded medical apps on both the Google Play and Apple App Store.
We’re a customer-first company. That means making sure all our products and features are built around you. This approach guides how we operate, make decisions across our businesses, and work on new ideas for our customers.
In 2021, an estimated 46 million U.S. adults were not able to afford needed care. We’re working to narrow this access gap. We have saved Americans over $30 billion on their medications.
Company values
We’re a customer-first company
Making sure all our products and features are built around you
This approach guides how we operate
Make decisions across our businesses
Work on new ideas for our customers
We’re working to narrow this access gap
We have saved Americans over $30 billion on their medications