Why Data Lakes Are More Powerful for the DOD than Commercial Industry


The ability to share information about fleets, weapons systems, supply chains and adversarial intelligence makes the power of a data lake too much of a draw to ignore.

For the last few years, the big topic of conversation in commercial industry has been big data’s impact on return on investment and the bottom line. A recent report predicts the data lake market will balloon more than 250% to $20.1 billion by 2024. Why is such a hot approach to using data not being deployed within our vital federal and defense sectors? In an industry where the ability to ingest, analyze and act on data could make the difference in mission-critical efforts, why hasn’t the Defense Department taken advantage of data lakes? The ability to share information about fleets, weapons systems, supply chains and adversarial intelligence is too powerful a draw to ignore.

A few key factors help explain what has hindered the military from harnessing data lakes in the past:

  • Legacy systems and older infrastructures to work around.
  • Highly classified data and data silos. 
  • Interoperability issues and multiple vendors deployed.
  • Size of data and existing bandwidth in the federal and defense sectors.
  • Understanding how and what to prioritize.
  • Disconnect in collaborative usage across the various branches of the military.

Advantages of Data Lakes in the DOD and Federal Sector

A past criticism of data lakes is that they often become a massive “data graveyard” where everything gets thrown without rhyme or reason. Part of the reason the commercial sector uses data lakes is just that: to create a crater that collects data of diverse types from all sources.

A major benefit for all military branches in using data lakes is access to a data source that is rich in diversity but also well aligned to missions across the military. Leveraging the standardized data stream schemas already used within the DOD helps avoid the disorganized “data graveyard.” Using data lakes will translate into massive time savings and the ability to deploy industry-grade machine learning and artificial intelligence applications. The Navy, Army, Marine Corps, Air Force, Space Force and most sectors within the DOD could use data lakes to drive warfare intelligence in a significant way. Imagine how much more powerful our military will be with the ability to access mission-relevant information, analyze it and push it out quickly, potentially opening exponential opportunities to predict enemy moves, assess inventory, and plan and execute routine or unplanned maintenance. Decision-makers could predict when and where enemy forces will act against us, putting our commanders in an advantageous position to win battles and weaken enemy threats.

Key Elements to Implementing Data Lakes in the Federal and DOD Sectors vs. Commercial

For any military division attempting to deploy a data lake, there are a few things to address to pave the way for unleashing the power of real-time big data: 

  • Assess current infrastructure and determine which solution will work most effectively with the existing architecture. The infrastructure should scale to ingest data from all streams into the lake. 
  • Determine security needs and compliance issues. Consider using DevSecOps so that security needs are addressed upfront, and create a set of rules to govern the data.
  • Create sound indexing and cataloging procedures to make it easier and quicker to access and share the data.
  • Develop a mechanism that will make the data available to applications that depend on BI, ML, AI, etc.
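
To make the cataloging and governance steps above more concrete, here is a minimal Python sketch of a catalog that indexes datasets by tag and applies a single access rule at search time. Everything in it is an assumption made for illustration: the DataCatalog and CatalogEntry names, the ordered list of sensitivity labels, and the example dataset names and paths are hypothetical and do not describe any actual DOD system or product.

```python
from dataclasses import dataclass, field

# Illustrative sensitivity labels (an assumption), ordered from least to most restricted.
LEVELS = ["UNCLASSIFIED", "CONFIDENTIAL", "SECRET", "TOP SECRET"]


@dataclass
class CatalogEntry:
    """Metadata describing one dataset stored in the lake (hypothetical shape)."""
    name: str
    location: str
    level: str
    tags: set = field(default_factory=set)


class DataCatalog:
    """Tag index over lake datasets, with a simple level-based access rule."""

    def __init__(self):
        self._entries = {}  # dataset name -> CatalogEntry
        self._by_tag = {}   # tag -> set of dataset names

    def register(self, entry):
        """Add or replace an entry and keep the tag index in sync."""
        if entry.level not in LEVELS:
            raise ValueError(f"unknown level: {entry.level}")
        old = self._entries.get(entry.name)
        if old is not None:
            # Drop stale tag references before re-indexing the entry.
            for tag in old.tags:
                self._by_tag[tag].discard(entry.name)
        self._entries[entry.name] = entry
        for tag in entry.tags:
            self._by_tag.setdefault(tag, set()).add(entry.name)

    def search(self, tag, clearance):
        """Return entries with this tag whose level is at or below the clearance."""
        if clearance not in LEVELS:
            raise ValueError(f"unknown clearance: {clearance}")
        allowed = LEVELS.index(clearance)
        names = sorted(self._by_tag.get(tag, set()))
        return [
            self._entries[n]
            for n in names
            if LEVELS.index(self._entries[n].level) <= allowed
        ]


if __name__ == "__main__":
    catalog = DataCatalog()
    catalog.register(CatalogEntry(
        "fleet_maintenance_logs", "lake/raw/maintenance/", "UNCLASSIFIED",
        {"fleet", "maintenance"}))
    catalog.register(CatalogEntry(
        "threat_reports", "lake/raw/intel/", "SECRET", {"fleet", "intel"}))
    for hit in catalog.search("fleet", "UNCLASSIFIED"):
        print(hit.name, hit.location)
```

In this sketch the access rule is applied at search time, so the index stays shared while each caller sees only the entries their clearance permits. A real deployment would likely rely on far richer policy and auditing than a single ordered list of labels.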

Though the federal and defense sectors have suffered from a challenging acquisition model in the past, the future looks bright because data lakes will provide a mission-critical advantage to our military services if we choose to embrace them.

Ryan Mapeso is a program manager at PMAT Inc.