Boll and Branch Builds Enterprise Data Warehouse on Google BigQuery

Boll and Branch (New Jersey, USA, https://www.bollandbranch.com) is a premium bedding and sheet e-commerce retailer with a fast-growing online business. The company needed a better approach to data-driven business decision-making. Its existing BigQuery-based data mart had limited scope and capabilities, so the need for a comprehensive Enterprise Data Warehousing solution was identified.

Myers-Holum Inc. led the Enterprise Data Warehousing implementation for Boll and Branch using Google Cloud Platform serverless technologies. Data was landed daily directly into BigQuery from multiple SaaS operational systems such as Netsuite, Shopify, Zendesk, and Iterable using the stitch.com serverless platform, as well as directly from the Segment audience tracking system. Monthly data files from third-party providers were landed in Google Cloud Storage. Cloud Scheduler, Cloud Functions, and BigQuery SQL were used to process the landed data serverlessly into staging, consolidated, and prepared layers. Data Studio dashboards were implemented to show analytical reports. Dimensional data modeling techniques were applied to build the consolidated layer with multiple fact tables and type 1 and type 2 slowly changing dimension tables.
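
To illustrate the difference between the two dimension types mentioned above, here is a minimal Python sketch. It is not the project's BigQuery SQL; the customer key, the "tier" attribute, and both function names are hypothetical and chosen only for illustration. Type 1 overwrites an attribute in place, while type 2 closes the current row and appends a new version so that history is preserved.

```python
from datetime import date

def apply_type1(dimension, key, attrs):
    """Type 1: overwrite the attributes in place; no history is kept."""
    dimension[key] = dict(attrs)

def apply_type2(history, key, attrs, effective):
    """Type 2: close the currently open row and append a new version."""
    current = next(
        (r for r in history if r["key"] == key and r["valid_to"] is None),
        None,
    )
    if current is not None:
        if current["attrs"] == attrs:
            return  # nothing changed, so the open version stays as it is
        current["valid_to"] = effective
    history.append(
        {"key": key, "attrs": dict(attrs), "valid_from": effective, "valid_to": None}
    )

# Hypothetical customer dimension: the tier changes from silver to gold.
type1 = {}
apply_type1(type1, "C1", {"tier": "silver"})
apply_type1(type1, "C1", {"tier": "gold"})

type2 = []
apply_type2(type2, "C1", {"tier": "silver"}, date(2020, 1, 1))
apply_type2(type2, "C1", {"tier": "gold"}, date(2020, 6, 1))
```

After these calls the type 1 table holds only the latest tier, while the type 2 history holds two rows: a closed silver row and an open gold row.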

With the new Enterprise Data Warehouse in place, Boll and Branch was able to gain additional, deeper insights into business performance and plan for continued growth.

PepkorIT Migrates Oracle Enterprise Data Warehouse to Google BigQuery

Steinhoff International Holdings (Cape Town, South Africa, www.steinhoffinternational.com) is a global retailer with stores across multiple regions. Steinhoff’s IT division, PepkorIT, was responsible for maintaining an existing on-premises, multi-tenant, Oracle-based Enterprise Data Warehouse (EDW) with custom SQL script ETL pipelines, daily batch loads of store transaction activity, various monthly consumer information data feeds, and analytical BI dashboards.

The legacy Oracle EDW was running out of capacity, and PepkorIT needed to improve time to insights from days to hours and minutes.

The decision was made to migrate the EDW to Google Cloud Platform, and PepkorIT needed planning, design, and implementation assistance. Myers-Holum Inc. (MHI) led the project to define the future Google Cloud-based solution architecture and implement a data ingestion framework using Dataflow. The framework reused the same pipeline for both batch ingestion from Google Cloud Storage and real-time ingestion from OLTP database binary logs streamed through Google Cloud Pub/Sub into BigQuery.
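
The idea of one pipeline serving both batch and real-time ingestion can be sketched in plain Python. This is an illustration only and does not use the Dataflow SDK; the function names, the JSON record layout, and the two stand-in sources are assumptions. The transform steps are written once and run over any iterable of lines, whether it is a finite file or an open-ended stream.

```python
import json
from typing import Iterable, Iterator, List

def parse_record(line: str) -> dict:
    """Shared transform: parse one JSON line and lowercase the field names."""
    raw = json.loads(line)
    return {key.lower(): value for key, value in raw.items()}

def run_pipeline(source: Iterable[str]) -> Iterator[dict]:
    """Run the same steps over any iterable of lines, bounded or unbounded."""
    for line in source:
        line = line.strip()
        if line:  # skip blank lines
            yield parse_record(line)

def batch_source() -> List[str]:
    # Stand-in for the lines of a file read from a storage bucket.
    return ['{"Store": 1, "Amount": 10.5}', "", '{"Store": 2, "Amount": 3.0}']

def stream_source() -> Iterator[str]:
    # Stand-in for change events arriving one at a time from a message queue.
    yield '{"Store": 3, "Amount": 7.25}'

batch_rows = list(run_pipeline(batch_source()))
stream_rows = list(run_pipeline(stream_source()))
```

Because `run_pipeline` only depends on an iterable of strings, the choice of source is the single point where batch and streaming differ in this sketch.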

The MHI solution centered on self-healing Dataflow pipelines that accommodated schema changes over time with minimal operational intervention and automatic data reprocessing. The solution included sensitive data masking, balance and control system tables, full data lineage for data landed into GCP, data quality rules implementation, BigQuery schema design based on Myers-Holum industry best practices, downstream data processing for BI and Analytics use cases, and job monitoring using Stackdriver and Data Studio dashboards.
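
A few of the features listed above (sensitive data masking, tolerance of new columns, and balance and control counts) can be sketched together in Python. This is a simplified illustration under stated assumptions, not the MHI implementation: the field names, the salt, the rejection rule for rows without an "id", and the `land` function are all hypothetical.

```python
import hashlib

SENSITIVE_FIELDS = {"email", "phone"}  # hypothetical list of columns to mask

def mask(value: str, salt: str = "demo-salt") -> str:
    """Replace a sensitive value with a truncated, salted SHA-256 digest."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]

def land(rows, known_columns):
    """Mask sensitive fields, note unseen columns, and keep control counts."""
    control = {"rows_in": 0, "rows_out": 0, "rows_rejected": 0, "new_columns": set()}
    landed = []
    for row in rows:
        control["rows_in"] += 1
        if "id" not in row:  # hypothetical data quality rule
            control["rows_rejected"] += 1
            continue
        # Record columns the target schema has not seen, instead of failing.
        control["new_columns"] |= set(row) - known_columns
        landed.append(
            {k: mask(str(v)) if k in SENSITIVE_FIELDS else v for k, v in row.items()}
        )
        control["rows_out"] += 1
    return landed, control

rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com", "loyalty_tier": "gold"},  # new column
    {"email": "c@example.com"},  # missing id, so it is rejected
]
landed, control = land(rows, known_columns={"id", "email"})

# Balance check: every input row is either landed or rejected.
balanced = control["rows_in"] == control["rows_out"] + control["rows_rejected"]
```

In a real pipeline the control counts would be written to a control table and the new columns would drive a schema update; here they are simply returned for inspection.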

The Google Cloud BigQuery-based EDW allowed Steinhoff to reduce time to insights from days to minutes. Google Cloud serverless technologies such as Dataflow provided scalable infrastructure to ingest batch and real-time data quickly and reliably while reducing CapEx and OpEx costs.

Teradata Migration to Google BigQuery

As a Teradata Enterprise Data Warehouse owner, are you tired of:

  • buying and hosting expensive proprietary hardware,
  • patching operating systems,
  • installing specialized database software,
  • managing database servers,
  • tuning database parameters,
  • planning upgrades and downtime,
  • worrying about increased data load times and ever-increasing data consumption needs,
  • dreading that multi-million-dollar yearly license renewal?

4 Facts about Google BigQuery

  • Did you know that a Google BigQuery project comes with a default of 2,000 query execution slots, which can be extended upon request?  Does your on-premises Teradata Data Warehouse infrastructure have 2,000-slot elasticity to run analytical queries?
  • Did you know that Google Cloud Platform bills separately for storage and query data processing?  Are you overpaying Teradata for either compute or storage capacity due to static hardware configuration and pricing models?
  • Did you know that Google Cloud Platform uses a Petabit network to distribute your data across multiple regions for redundancy and high availability?  Are you worried about your Teradata cross-data-center Data Warehouse failover and disaster recovery?
  • Do you know how much time, effort and resources you are spending on managing Teradata Data Warehousing on-premise infrastructure complexity instead of focusing on data, insights and your customer?

Advantages of Google BigQuery

Migrating very large Data Warehouses from a Teradata platform to Google BigQuery offers significant potential advantages:

  • The elastic scalability of the cloud infrastructure eases cost/performance tradeoffs
  • Data ingestion patterns can be simplified
  • Integration with sophisticated cloud-based analytical toolsets is readily supported
  • A serverless NoOps environment removes the infrastructure maintenance burden, allowing you to refocus resources on data and business insights

Deep Dive

Read our blog here, where a series of articles, collaboratively written by data and solution architects at Myers-Holum, Inc. and Google, describes an architectural framework for converting data warehouses from Teradata to the Google Cloud Platform.  The series explores common architectural patterns for Teradata data warehouses and outlines best-practice guidelines for porting these patterns to the Google Cloud Platform toolset.

Myers-Holum is here to help

Myers-Holum can help you navigate considerations of performance, cost, reliability, and security when including cloud platforms in your mix of data warehouse deployments.

MHI uses a model-based approach and metadata-wise tools to efficiently migrate data warehouse components from traditional to cloud platforms, translating schemas and ingestion and consumption processes for optimal performance in the new architecture. We maintain high standards for metadata integrity and governance, data lineage, and code discipline.

Assessment

We evaluate where it makes sense to include Cloud platforms in your Data Warehouse environment, and assess the complexity of making the migration.  The assessment focuses on core business requirements, three different Teradata data warehouse architectural styles, any canonical data models in use, layer architectures, and semantic layer implementations.  We then review existing batch and streaming source data capture to preserve your existing investment.  We analyze data consumption patterns including frequency, resources, and volume.  Finally, we review the Data Governance programs you have in place.

Future State

Based on the assessment, we propose the Google Cloud Platform products to be used and the best practices to be applied.  We define data modeling patterns and detailed examples for converting the Teradata semantic layer and star schema into BigQuery repeated nested structures.  We suggest a source data capture approach (ETL vs. ELT vs. UPM Dataflow) and tooling within Google Cloud Platform, using native Cloud Dataflow capabilities and/or 3rd party integration tools, with lift and carry as much as possible.  For data ingestion into Google BigQuery, we define connectivity to on-premises and cloud data sources. For data consumption, we recommend a best-of-breed approach that uses either existing analytics and reporting tools or newly available analytics tooling to democratize data analytics.  It’s important to define data security and access models, and an auditing approach, for enterprise data in the cloud. We suggest adjustments to Data Governance programs for the cloud.  Finally, we recommend aspirational machine learning data insights opportunities utilizing CloudML, TensorFlow, and Google Cloud AI APIs.
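
As a rough illustration of what converting joined tables into a repeated nested structure means, the Python sketch below folds child rows into a list field on each parent record. The order and line-item tables, their column names, and the `nest` function are hypothetical examples, not taken from an actual migration; in BigQuery the equivalent result would be a table with a repeated record column.

```python
from collections import defaultdict

# Hypothetical source tables: order headers and their line-item rows.
orders = [
    {"order_id": 100, "customer": "C1"},
    {"order_id": 101, "customer": "C2"},
]
line_items = [
    {"order_id": 100, "sku": "SHEET-Q", "qty": 2},
    {"order_id": 100, "sku": "PILLOW", "qty": 4},
    {"order_id": 101, "sku": "DUVET", "qty": 1},
]

def nest(orders, line_items):
    """Fold line-item rows into a repeated 'items' field on each order."""
    items_by_order = defaultdict(list)
    for item in line_items:
        items_by_order[item["order_id"]].append(
            {"sku": item["sku"], "qty": item["qty"]}
        )
    return [
        {**order, "items": items_by_order.get(order["order_id"], [])}
        for order in orders
    ]

nested = nest(orders, line_items)
```

Each nested record now carries its own line items, so a query over orders no longer needs a join to reach them.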

Roadmap

We work with your expert staff to create a business, financial, architectural, and technical roadmap for migrating your DW to the cloud.  Special attention is paid to ROI and iterative delivery to show progress early and often.

Implementation

We carry out the migration following a carefully planned, staged implementation strategy, delivering real business benefit at each stage.

Contact Us

Contact us at cloudinfo@myersholum.com or 646.844.4493 to learn more about Teradata to BigQuery migrations!