• The Data-Driven Imperative

    Agencies today strive to provide constituents with better products and services in budgetary environments that, increasingly, ask leaders to spend less. This might sound paradoxical, even impossible, but it’s not. 

    Government has a vast expanse of data at its fingertips, and the ability of agencies to share—and capitalize on—that data is growing. In fact, the White House has made it clear that it is committed to the continuation of open data practices, making highly actionable data sets across the federal government available to agencies that can then use the data for enterprise transformation.

    These insight cards will help you better understand the power of data and how your agency, partnering with Cloudera, can:

           ♦   Lay the groundwork for a migration to cloud

           ♦   Modernize your data infrastructure with an enterprise data hub

           ♦   Protect colossal amounts of data

           ♦   Instrument the business of government

           ♦   Connect with constituents

           ♦   Jump-start your journey to success

    #1: Laying the groundwork

    Moving to cloud is a less costly and more convenient way of handling government data.

    In a fixed-infrastructure model, frustrations can arise because organizations rely heavily on IT to provide hardware and prepare the environment, decreasing agility. Cloud enables the self-service that many agencies seek, and cloud’s elasticity allows them to address each task with exactly the compute capacity needed.

    Cloudera SVP of Products Charles Zedlewski spoke about the benefits of migrating to cloud at the 2017 Cloudera Government Forum.

    It’s rare that you find something interesting off a single data source. Government applications become most useful when they can draw from multiple, up-to-date data sources.
    Charles Zedlewski, SVP of Products at Cloudera

    In the cloud, spinning up new environments for disparate applications is much easier than doing so within a fixed infrastructure, meaning different application clusters can operate in purpose-built environments. Moreover, many cloud environments can leverage the same cloud storage, simplifying the data-sharing process between applications. 
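
    To make the shared-storage idea concrete, here is a minimal PySpark sketch, assuming a hypothetical object-store bucket and paths: two purpose-built clusters work against the same cloud storage, so data moves between applications without being copied or re-ingested.

```python
# A minimal sketch of the shared-storage pattern: two purpose-built clusters
# read and write the same object-store dataset. Bucket and paths are
# hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("etl-cluster").getOrCreate()

# Cluster A (an ETL cluster) writes curated records to shared cloud storage.
raw = spark.read.json("s3a://agency-data-lake/raw/permits/")
raw.filter(raw.status == "APPROVED") \
   .write.mode("overwrite") \
   .parquet("s3a://agency-data-lake/curated/permits/")

# Cluster B (a separate analytics cluster) reads the same path directly;
# the data never has to be exported or re-ingested.
curated = spark.read.parquet("s3a://agency-data-lake/curated/permits/")
curated.groupBy("county").count().show()
```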

    Operating in the cloud is more convenient for government personnel and enables agencies to more rapidly provide cost-effective constituent services. Agencies can shut down services when they aren’t needed and run applications only as much and as often as necessary. There’s also no worry about resource contention in the cloud, because users can spin up their own clusters for a particular workload, eliminating the risk of bottlenecking.

    The benefits of the government using public cloud are far-reaching and are exactly why federal agencies now operate on a cloud-first policy as well as why major cloud-computing companies increasingly tailor their services to government clients.

    But before federal agencies can truly rule the cloud, they face the task of establishing one way to access, view, and use data no matter where it is stored: a cloud-native machine-learning (ML) and analytics solution that’s easy to use, works with any kind of data, and delivers self-service agility without sacrificing security, governance, or scalability. Such a solution provides a single administration interface for central IT, gives end users agility, and elastically scales clusters, all while ensuring auditability.
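
    As an illustration of that elasticity (though not of any particular vendor’s administration interface), the sketch below provisions a transient cluster through one public cloud’s API, boto3 against AWS EMR, and lets it terminate itself when its single job finishes. The cluster name, roles, and script path are hypothetical.

```python
# A sketch of self-service, transient compute: provision a cluster for one
# workload and tear it down when the job ends. Names, roles, and the script
# path are hypothetical.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="benefits-analytics-transient",
    ReleaseLabel="emr-5.8.0",
    Instances={
        "MasterInstanceType": "m4.xlarge",
        "SlaveInstanceType": "m4.xlarge",
        "InstanceCount": 4,
        "KeepJobFlowAliveWhenNoSteps": False,  # terminate after the step
    },
    Steps=[{
        "Name": "nightly-aggregation",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "s3://agency-jobs/aggregate.py"],
        },
    }],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print("Launched cluster:", response["JobFlowId"])
```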

    #2: Modernizing government business

    The Digital Accountability and Transparency Act (DATA Act) continues to be implemented across the public sector. As agencies further grasp and disclose their financial information, acting federal CIO Margie Graves believes all agencies will be better prepared to begin (or redouble their efforts toward) the long-needed process of modernizing legacy systems.

    From a data-storage perspective, this could mean deploying a solution that allows for the central management of all the data belonging to an agency. One particularly deft expression of this solution is the enterprise data hub (EDH), a central location for all the data pertinent to an agency’s internal and external functions. Data stored in an EDH exists in its original form but is fully integrated with the agency’s existing infrastructure and tools. Beyond data storage, the EDH acts as an ML and analytics platform, so you can start to realize value from your data almost immediately.
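
    To see what “original form” means in practice, consider a minimal PySpark sketch, with hypothetical paths and fields: the hub holds a legacy system’s CSV alongside a web application’s JSON, and one engine queries across both on day one.

```python
# A minimal sketch of the EDH idea: data stays in its original formats in
# one central store, and the same engine queries across them. Paths and
# fields are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("edh-sketch").getOrCreate()

# Ingested as-is: CSV from a legacy financial system, JSON from a web app.
ledger = spark.read.option("header", "true").csv("hdfs:///edh/finance/ledger/")
portal = spark.read.json("hdfs:///edh/web/portal_events/")

# Analytics on day one: relate spending records to portal activity.
ledger.join(portal, "program_id") \
      .groupBy("program_id") \
      .count() \
      .show()
```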

    For years, data has existed in agency silos, making it increasingly difficult to answer multi-factor analytic questions—especially as that siloed data inevitably grew in volume, variety, and velocity. But an EDH can break down silos, tame unruly data, and transform the government enterprise.

    And with tools like Cloudera Manager within Cloudera Enterprise, transitioning to a modern platform with advanced analytics can turn data into a strategic asset for government organizations. An EDH enables self-service and agility for users while maintaining a secure and compliant environment.

    #3: Protecting your agency and lowering risk

    Government agencies are charged with knowing their citizens and business operations inside and out in order to provide high-quality, low-lift services (more on how an EDH optimizes the mission in the next card). To do so, however, agencies must secure masses of highly sensitive information ranging from health data and other personally identifiable information (PII) to classified government documents. Unfortunately, this means government databases are a glaring target for bad actors. It also means agencies are subject to rigorous compliance regulations like the Federal Information Security Management Act (FISMA) and the Health Insurance Portability and Accountability Act (HIPAA), both of which set strict guidelines for safeguarding sensitive data.

    But being a clear target does not have to mean being a vulnerable one. The need for user authentication and authorization, data protection, and auditing is standard, and the open source community has responded appropriately. Powerful, compliance-ready tools like Cloudera Manager and Cloudera Navigator defend agencies’ information and meet necessary requirements right out of the box—but even with reliable protections, proactively identifying unknown threats remains a challenge.
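
    Authorization of this kind is typically expressed as ordinary SQL grants. The sketch below, with hypothetical host, role, and database names, shows the Sentry-style pattern run through Impala from Python: analysts can read claims data but cannot change it.

```python
# A sketch of standard role-based authorization, as Sentry-managed SQL
# grants executed through Impala. Host, group, and database names are
# hypothetical; your agency's security setup will differ.
from impala.dbapi import connect

conn = connect(host="impala.agency.example", port=21050)
cur = conn.cursor()

# Analysts may read claims data but cannot modify it.
cur.execute("CREATE ROLE claims_analyst")
cur.execute("GRANT ROLE claims_analyst TO GROUP analysts")
cur.execute("GRANT SELECT ON DATABASE claims TO ROLE claims_analyst")

conn.close()
```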

    Cloudera’s cybersecurity solution, based on Apache Spot, enables anomaly detection, behavior analytics, and comprehensive access across all enterprise data using an open, scalable platform that allows organizations to build custom solutions as well as deploy packaged applications on top of one shared, enriched data set. By drawing on a diverse open source community to accelerate shared innovation, and by changing the economics of cybersecurity, agencies can come together to fight back against cyber threats.
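
    Apache Spot’s own analytics are considerably more sophisticated, but a toy example conveys the idea of flagging anomalous behavior in a shared flow data set. This sketch, over a hypothetical flow log, simply flags hosts whose outbound volume sits far outside the norm.

```python
# A toy anomaly-detection pass over a shared flow data set. This is NOT
# Apache Spot's model; it is a minimal z-score baseline over a hypothetical
# flow log with columns src_ip, dst_ip, bytes.
import csv
from collections import defaultdict
from statistics import mean, stdev

bytes_by_host = defaultdict(int)
with open("flows.csv") as f:
    for row in csv.DictReader(f):
        bytes_by_host[row["src_ip"]] += int(row["bytes"])

volumes = list(bytes_by_host.values())
mu, sigma = mean(volumes), stdev(volumes)  # needs at least two hosts

# Flag hosts sending far more traffic than their peers.
for host, total in bytes_by_host.items():
    if sigma > 0 and (total - mu) / sigma > 3.0:
        print(f"anomalous outbound volume: {host} ({total} bytes)")
```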

    But security is about more than internal threat detection. As it stands, over 30 percent of government employees lack confidence in their organization’s ability to detect and protect against fraud, waste, and abuse, a staggering figure considering what government data can do when properly utilized. In the fall of 2016, HHS Chief Data Officer Caryl Brzymialkiewicz led a team that cracked a $1 billion fraud, waste, and abuse case within HHS thanks to data analysis. She told her story at the 2017 Cloudera Government Forum.

    Fraud detection is an area where government leaders can learn from their neighbors in the private sector. For instance, Choice Hotels Principal Software Engineer Robert Bushman explained in a January 2017 webinar with Cloudera that his company was able to quickly and effectively cull data sets of customer information in order to pinpoint fraudulent activity. 

    Not unlike the federal government, Choice Hotels deals with data from a seemingly overwhelming number of sources: travelers at 6,400 hotels in 40 countries across the globe. Yet, using Cloudera’s platform, Bushman and his team analyzed specific sets of traveler data such that “we can see patterns of behavior in our customers that lead to—or that are a sign of—fraudulent activity,” he said. Public sector leaders could easily follow suit, capitalizing on their data to zero in on fraud and begin the work of stamping it out.
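
    As a hedged sketch of that pattern-of-behavior approach (the schema, data, and threshold here are invented, not Choice Hotels’), one classic signal is physically implausible booking velocity: the same customer booking in two countries within an hour.

```python
# A minimal fraud-signal sketch: flag customers whose booking behavior is
# physically implausible. Records and the one-hour threshold are
# hypothetical.
from datetime import datetime

bookings = [
    # (customer_id, hotel_country, booked_at)
    ("c1", "US", datetime(2017, 1, 3, 9, 0)),
    ("c1", "JP", datetime(2017, 1, 3, 9, 20)),
    ("c2", "US", datetime(2017, 1, 3, 11, 0)),
]

# Two bookings in different countries within an hour is a fraud signal.
last_seen = {}
for cust, country, ts in sorted(bookings, key=lambda b: b[2]):
    if cust in last_seen:
        prev_country, prev_ts = last_seen[cust]
        if country != prev_country and (ts - prev_ts).total_seconds() < 3600:
            print(f"possible fraud: {cust} booked {prev_country} then {country}")
    last_seen[cust] = (country, ts)
```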

    #4: Connecting your organization

    As government and education organizations modernize, vast opportunities present themselves, opening new avenues of citizen service and changing how agencies inform decision making.

    Sensors and meters embedded in the world around us are constantly amassing data: energy expenditure, weather patterns, and incident frequency make up just a few broad categories, but agencies have access to an astounding variety of information flowing in from an equal variety of sources. 

    Internet of Things (IoT) technologies are the reason for the growth in data at many government agencies. Because the public sector hopes to use that data to inform decisions through ML and analytics, decision makers are on the lookout for solutions capable of processing multiple data formats. It’s time to focus not on the challenges this expanding data creates but instead on the opportunities it presents.

    In health care, for example, practitioners can stream data from a patient’s bedside and export it to a given caregiver station. Using IoT devices to ingest and share patients’ vital signs and prescriptions, and someday even to monitor their blood in real time (seriously), increases both the precision of care and the efficiency of the health care facility.
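
    A minimal sketch of that bedside-to-station flow might look like the following, assuming vitals arrive on a Kafka topic and that the Spark Kafka connector is available; the topic names, schema, and alert thresholds are hypothetical.

```python
# A sketch of streaming bedside vitals and routing abnormal readings to a
# caregiver-station topic. Topics, schema, and thresholds are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StringType, DoubleType

spark = SparkSession.builder.appName("vitals-stream").getOrCreate()

schema = StructType() \
    .add("patient_id", StringType()) \
    .add("heart_rate", DoubleType()) \
    .add("spo2", DoubleType())

vitals = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "bedside-vitals")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("v"))
    .select("v.*"))

# Route only abnormal readings to the caregiver station's dashboard topic.
alerts = vitals.where("heart_rate > 120 OR spo2 < 90")
query = (alerts.selectExpr("patient_id AS key", "to_json(struct(*)) AS value")
    .writeStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("topic", "caregiver-alerts")
    .option("checkpointLocation", "/tmp/vitals-ckpt")
    .start())
query.awaitTermination()
```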

    But IoT benefits extend beyond the health sector. Vehicle manufacturers like Sikorsky can now build military helicopters with self-monitoring sensors that enable predictive maintenance—that is, knowing enough about the behavior and stamina of a system to accurately schedule necessary maintenance. This pre-empts breakdowns and other kinds of system failure, an enormous opportunity for the Department of Defense.
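
    Real fleet models are far richer, but the core of predictive maintenance can be sketched in a few lines: fit a trend to a component’s sensor history and estimate when it will cross its safe operating limit, so maintenance is scheduled before failure. The readings and limit below are invented.

```python
# A toy predictive-maintenance estimate: fit a straight-line trend to one
# component's vibration history and project when it crosses the safe limit.
# Hours of operation -> vibration amplitude (mm/s); all values hypothetical.
hours = [100, 200, 300, 400, 500]
vibes = [1.1, 1.3, 1.6, 1.8, 2.1]
LIMIT = 3.0

n = len(hours)
mean_x, mean_y = sum(hours) / n, sum(vibes) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, vibes)) \
      / sum((x - mean_x) ** 2 for x in hours)
intercept = mean_y - slope * mean_x

# Hour at which the fitted trend reaches the limit.
if slope > 0:
    print(f"schedule maintenance near hour {(LIMIT - intercept) / slope:.0f}")
```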

    Transportation also stands to benefit from IoT if states and localities can seize the data generated by traffic patterns during busy times of day or in inclement weather, and use it to optimize the routes of government service vehicles. Yet, despite the clear opportunity for innovation, a June 2017 Brookings report concluded that “many publicly generated data sets are not yet part of public planning processes.” In government, IoT is untapped potential.

    The depth and breadth of the benefits IoT technologies can bring to daily life is striking: IoT has the power to optimize the way government does business as well as the way constituents interact with government.

    #5: Knowing your constituents

    At the core of government’s mission is the constituent. In reality, that means any one of hundreds, thousands, millions of individual Americans, each with a unique set of interests, desires, and behaviors that inform the way they’ll interact with government services. Constituent insight, it follows, is essential to government’s mission.

    With a centralized repository of structured and unstructured data—sometimes called a “data lake”—and an EDH on top of it for processing, analyzing, and connecting to other applications, agencies can create “360-degree profiles” of their constituents. These 360 profiles would include traditional identifying information (think demographics) coupled with behavioral information (think a list of government benefits or even the last government website a user visited).
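
    A minimal sketch of assembling such a profile from a data lake, with hypothetical paths, keys, and fields, is simply a set of joins: demographic records enriched with aggregated behavioral events.

```python
# A minimal sketch of a 360-degree constituent profile: join demographic
# records to aggregated behavioral events. Paths, keys, and fields are
# hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import collect_set

spark = SparkSession.builder.appName("constituent-360").getOrCreate()

demographics = spark.read.parquet("hdfs:///lake/constituents/demographics/")
web_events   = spark.read.json("hdfs:///lake/constituents/web_events/")
benefits     = spark.read.parquet("hdfs:///lake/constituents/benefits/")

profile = (demographics
    .join(web_events.groupBy("constituent_id")
                    .agg(collect_set("page").alias("pages_visited")),
          "constituent_id", "left")
    .join(benefits.groupBy("constituent_id")
                  .agg(collect_set("program").alias("benefits_enrolled")),
          "constituent_id", "left"))

profile.write.mode("overwrite").parquet("hdfs:///lake/profiles/360/")
```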

    By mining these datasets, agencies can satisfy the needs of citizens as a group as well as provide personalized experiences for every citizen.

    In 2011, researchers at Dartmouth began the Durkheim Project, an ongoing effort to study rates of suicide among veterans and to increase the efficacy of preventative methods. Eventually collaborating with analytics firm Patterns and Predictions (P&P) and the Defense Advanced Research Projects Agency (DARPA), the team developed a machine learning tool that predicts suicide risk with a statistically significant 65% accuracy.

    How? The project’s subjects produced a slew of structured and unstructured data—from demographic information to clinical assessments—that contributed to an individualized risk profile. The Durkheim Project used Cloudera solutions like Search and Impala to ingest up to 100,000 veteran profiles over the course of the study, processing more than a terabyte of data daily in real time and allowing the team to see impactful results quickly.

    Applying analytics to 360-degree profiles empowers agencies to draw incredible, consequential conclusions about citizens at large. The same methods behind the Durkheim Project can also be used to direct an individual to government services they might want or need based on the other services they use—this is like the “recommended for you” sections of Netflix or Amazon, and it can be especially beneficial for underserved or under-engaged populations like immigrants, the elderly, and those with low income. Behavior, like height, education, and income, is actionable data—and harnessing it can have profound implications, whether that’s preventing the loss of life or contributing to an uptick in civic engagement.
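
    To make the “recommended for you” analogy concrete, here is a toy co-occurrence recommender over invented service-enrollment data: it suggests services that frequently appear alongside the ones a constituent already uses.

```python
# A toy "recommended for you" over service enrollments: score unused
# services by how often they co-occur with the user's current services.
# All enrollment data is hypothetical.
from collections import Counter
from itertools import combinations

enrollments = {
    "a": {"snap", "medicaid", "liheap"},
    "b": {"snap", "liheap"},
    "c": {"snap", "medicaid"},
    "d": {"medicaid", "wic"},
}

# Count how often each pair of services is used together.
cooccur = Counter()
for services in enrollments.values():
    for pair in combinations(sorted(services), 2):
        cooccur[pair] += 1

def recommend(user):
    """Score services the user lacks by co-occurrence with services they use."""
    mine = enrollments[user]
    scores = Counter()
    for (s1, s2), n in cooccur.items():
        if s1 in mine and s2 not in mine:
            scores[s2] += n
        elif s2 in mine and s1 not in mine:
            scores[s1] += n
    return scores.most_common()

print(recommend("b"))  # [('medicaid', 3)]
```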

    #6: Building the culture and growing your agency team

    Becoming a data-driven organization requires far more than great technology. It requires agencies to implement best practices around people and processes. Perhaps the most important element to success is having a strong business sponsor for the overall big data mission and business stakeholders for individual use cases. Identifying strong leaders in the organization to take charge during this transformation is key to success.

    Developing the right team is pivotal in this process as the traditional business intelligence (BI) and analytics model has evolved in the big data world. The engineering team is now strategic because data itself is more agile, and this team must be responsible and accountable for the organizational data. It’s important that agencies build tightly aligned teams with a mix of industry experts and creative innovators to promote success.

    Another component of success is a focus on agility and lean development. Successful projects start small, fail often, and embrace an “iterate to success” approach. The fundamental concepts of agile methodology are epics, stories, tasks, scrum teams, and sprints. Become adept at using epics to document broad concepts and requirements for the EDH infrastructure, for data collection and management, and for use case development.

    Analytics generate reports, and big data generates actions. Successful businesses span organizational gaps by building a bridge between development and operations (DevOps), and governments are no different. This concept was successful in the early days of web development, when the need arose to have a hybrid team that sat between web development and the IT department. DevOps concepts have since moved more broadly into the world of application development, and they are being deployed to manage data that needs to move to production as well as the models used to create insights from that data.

    One of the most important aspects of success with big data is data governance. Successful operations govern at the level of the data. Because an EDH lets users collect data in full fidelity, it requires less governance as the data enters the system and is used by data engineers, data scientists, or app developers—it’s when the data crosses the DevOps line into production that more governance is put in place. This is purposeful: You want your power users to have full access in their development environments, but you also want to lock down production for use by many analysts and business users. You must fully understand the lineage of the data and be able to audit it once it has reached the production level.
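
    That dev-open, production-locked split can be expressed as plain grants. A minimal sketch, with hypothetical names and the same Sentry-style SQL over Impala as earlier: power users get full access to a sandbox, while production stays read-only.

```python
# A sketch of the dev-open / production-locked governance split, as
# Sentry-style SQL grants over Impala. All names are hypothetical.
from impala.dbapi import connect

conn = connect(host="impala.agency.example", port=21050)
cur = conn.cursor()

# Power users get full access in the development zone...
cur.execute("GRANT ALL ON DATABASE dev_sandbox TO ROLE data_scientist")

# ...but production is read-only for analysts and business users.
cur.execute("GRANT SELECT ON DATABASE prod_curated TO ROLE analyst")

conn.close()
```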

    #7: Conclusion

    This is an inflection point. Agencies can master their data—the tools and means are certainly available—or they can let it swallow them whole. By aggregating and mining data already available to them, agencies can home in on constituent needs, instrument and improve critical functions using the Internet of Things, secure their data in transit and at rest, modernize their platforms, and operate in the cloud at peak efficiency. Data can transform government business, and it’s time government leaders took the plunge.
