ModernGov recently spoke with Michael Ho, Vice President for Software AG Government Solutions, about the key issues facing government agencies when it comes to handling, analyzing, and correlating data they collect from innumerable sources. That data collection and storage has exploded over the past few years, and, in fact, 90 percent of the data on the Internet today was created in the past two years. We generate over a billion gigabytes of data every day on the Internet alone.
Agencies can quickly become overwhelmed with that amount of data, making it almost useless to them. Ho notes that there are solutions that can help agencies ensure that the data collected is analyzed in a timely manner and shared with appropriate analysts securely so they can leverage their data for mission success.
ModernGov: Government agencies are swamped with information. What are some of the key issues they face in handling big data?
Michael Ho: One of the biggest challenges facing federal agencies is getting useful information out of their data that can help them perform their missions more effectively. Timing can be key to solving this challenge. If you can bring data to an end user in a time period when the data is most relevant and actionable to their particular situation or scenario, then the value of that data is exponentially greater. To accomplish this, you need to get the data close enough to the end user that he or she can make sense of it and use it when they need it most. The reality today is that data that is collected either gets lost in transition or it never goes anywhere out of the system that it lives in.
For example, let’s say an agency is tracking a potential threat in a major metropolitan area. One of the agency’s analysts is charged with monitoring and flagging any suspicious social media activity that may be related. This is an enormous amount of data to sift and search through, especially in real-time. However, if the analyst is able to leverage a secondary system – one that provides a list of potential targets – they can make their searches, analysis and correlation more efficient. The analyst uses a normally static, relatively small set of data (potential targets) to gain relevancy within the scope of “big data” (social media streams) at that particular moment in time.
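To make the idea concrete, here is a minimal sketch of using a small, static list to narrow a large stream. It is an illustration only, not a description of any Software AG product: the `Post` type, the `flag_posts` function, the watchlist entries, and the sample posts are all hypothetical, and simple substring matching stands in for whatever matching a real system would use.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Post:
    """Hypothetical stand-in for one item in a social media stream."""
    author: str
    text: str


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def flag_posts(posts: Iterable[Post], watchlist: Iterable[str]) -> Iterator[tuple]:
    """Yield (post, matched_terms) for posts that mention a watchlist entry."""
    terms = [normalize(term) for term in watchlist]
    for post in posts:
        body = normalize(post.text)
        hits = [term for term in terms if term in body]
        if hits:
            yield post, hits


# Assumed sample data: a small static list and a tiny slice of a "stream".
WATCHLIST = ["Central Station", "Harbor Bridge"]
STREAM = [
    Post("user_a", "Traffic is terrible near Harbor   Bridge today"),
    Post("user_b", "Great coffee this morning"),
    Post("user_c", "Meeting friends at central station, then Harbor Bridge"),
]

flagged = list(flag_posts(STREAM, WATCHLIST))
for post, hits in flagged:
    print(post.author, hits)
```

Because `flag_posts` is a generator, it can consume posts as they arrive rather than waiting for the full data set, which is the point of working within the data's window of relevance.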
ModernGov: What insights can be derived from leveraging data and not leaving it sitting in a database?
MH: You want to get to the nth degree of understanding. Whether you are looking at financial data, intelligence information, or even just internal enterprise data, you want to create data linkages that help create a more holistic picture of what is happening in the scenario you are studying. The importance of the relevancy issue – getting your data in front of your users within its window of relevance – is that it allows them to establish that nth degree and make the data they already have more useful.
An example would be an agency that has two data collection systems running at the same time. One is the HR employee resource management system and the other is a physical system – the card reader that personnel use to gain access to the building. By themselves, they have their own uses. The HR system takes care of day-to-day employee management and the card reader determines access to areas of the building. When you tie them together, however, you can get much more detail. You can discover which departments are working the longest or which personnel have to go in and out of the building most often. In tying these two systems together, you can help streamline staffing and/or update policies around overtime. You can even identify insider-threat scenarios, for instance when an individual suddenly changes their travel pattern within the building after receiving negative reviews that are entered in the HR system. Or you can simply correlate the data to improve the way you communicate with employees, or create a rule that “x” percent of the building will receive certain types of information because it is relevant to them. There are several ways to use the data that is sitting there to help make the agency more efficient and get more done.
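A rough sketch of tying the two systems together might look like the following. Everything here is assumed for illustration: the employee IDs, department names, the `(employee, direction, timestamp)` event layout, and the function names do not come from any real HR or card reader system.

```python
from collections import defaultdict
from datetime import datetime

# Assumed HR extract: employee ID -> department.
HR_RECORDS = {"e1": "Finance", "e2": "Finance", "e3": "Legal"}

# Assumed card reader log: (employee ID, direction, ISO timestamp).
BADGE_EVENTS = [
    ("e1", "in", "2024-01-08T08:00"), ("e1", "out", "2024-01-08T18:00"),
    ("e2", "in", "2024-01-08T09:00"), ("e2", "out", "2024-01-08T17:00"),
    ("e3", "in", "2024-01-08T09:00"), ("e3", "out", "2024-01-08T12:00"),
    ("e3", "in", "2024-01-08T13:00"), ("e3", "out", "2024-01-08T17:00"),
]


def hours_on_site(events):
    """Return (hours per employee, number of entries per employee)."""
    totals = defaultdict(float)
    entries = defaultdict(int)
    last_in = {}
    # ISO timestamps in the same format sort chronologically as strings.
    for emp, direction, stamp in sorted(events, key=lambda e: e[2]):
        when = datetime.fromisoformat(stamp)
        if direction == "in":
            last_in[emp] = when
            entries[emp] += 1
        elif emp in last_in:
            # Pair each "out" with the most recent unmatched "in".
            totals[emp] += (when - last_in.pop(emp)).total_seconds() / 3600
    return dict(totals), dict(entries)


def average_hours_by_department(hr, totals):
    """Join badge totals to HR departments and average the hours."""
    by_dept = defaultdict(list)
    for emp, hours in totals.items():
        by_dept[hr.get(emp, "unknown")].append(hours)
    return {dept: sum(h) / len(h) for dept, h in by_dept.items()}


totals, entries = hours_on_site(BADGE_EVENTS)
dept_hours = average_hours_by_department(HR_RECORDS, totals)
print(dept_hours)
print(entries)
```

The join key (the employee ID) is what makes the correlation possible; in practice, agreeing on that shared identifier across systems is often the harder part.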
ModernGov: What other benefits do data correlation and visualization bring to government agencies?
MH: We all strive to have computer systems that do the correlations for us, but at the end of the day it is the human intelligence that is applied to the data that really does the correlation. Where visualization becomes critical is when users are trying to correlate data. For example, you want to present data either in a way that it hasn’t been presented before or in a way that is easy to consume. You don’t want an analyst digging through a million-record spreadsheet with 50 columns of data. You want that analyst to look at a chart, a map, or a heat map of the data so they can get a quicker understanding of it. If they want more details, they can dig into it. Visualization is also critical to the window of relevance. Data can fall out of its window of relevance when we can’t get to it fast enough because we are stuck analyzing all the other things that are out there. We need to ask, how can we leverage visualization to make these analyses and correlations faster and more efficient to allow us to get to more data within this window of relevance?
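As a small illustration of moving from raw rows to something chartable, the sketch below collapses records into a grid of counts that a heat map could be drawn from. The `heatmap_counts` function, the `door` and `hour` field names, and the sample records are assumptions made for this example; the plotting step itself is left out.

```python
from collections import Counter


def heatmap_counts(records, row_key, col_key):
    """Count records per (row, column) pair and return a dense grid."""
    counts = Counter((r[row_key], r[col_key]) for r in records)
    rows = sorted({row for row, _ in counts})
    cols = sorted({col for _, col in counts})
    # Missing combinations become 0 so every row has the same width.
    grid = [[counts.get((row, col), 0) for col in cols] for row in rows]
    return rows, cols, grid


# Assumed sample: door swipes tagged with the hour of day.
RECORDS = [
    {"door": "lobby", "hour": 8},
    {"door": "lobby", "hour": 8},
    {"door": "lobby", "hour": 17},
    {"door": "lab", "hour": 8},
]

rows, cols, grid = heatmap_counts(RECORDS, "door", "hour")
for label, line in zip(rows, grid):
    print(label, line)
```

The grid is a summary, so the analyst sees four cells instead of every row, and can still go back to the underlying records for any cell that looks unusual.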
ModernGov: Can you give us a few tips for how government agencies can unshackle their data?
MH: Unshackling is a broad question because there are different ways that data is locked down in agencies. The first way is that the data is just sitting in its source system and is not being used. There is a lot of information out there like that, especially in systems that use physical components for data collection, such as sensors, readers and scanners used for monitoring a particular situation. Often that data doesn’t go anywhere beyond the system for which it was designed.
When data doesn’t go anywhere, it isn’t as useful as it could be to the overall enterprise, so there needs to be a lot of creative thinking surrounding data. What systems do we have out there that we don’t traditionally think of as data sources, but have data? Could we use that data to make the agency more efficient, help our employees do a better job, or spend resources more efficiently? Creative thinking is an important element – I have this data, so what can I do with it?
As I mentioned, many systems have been set up to gather or create data, but they don’t talk to each other. Our challenge has become getting that data integrated so that we can correlate it and extract relevancy from it. Traditionally, you would get the people who “own” the data in a room to talk about it and see if they come up with some agreement about how they can use each other’s data. This is a very long and tedious process – highly inefficient. Luckily, changes to systems architecture over the past decade now allow people to expose their data without going through these arduous manual processes. You want to create an enterprise sharing mechanism, an integration server that is a broker to other places within the enterprise. With something like this, a user doesn’t necessarily have to have an account on your system to get to your data. All that person needs is an intermediary system, managed by security and IT processes, that allows for an exchange of data within prescribed parameters. Then, users do not have to manually determine what to send to each other, what agreements have to be made, or what data will be shared.
The intermediary system handles the sharing protocols, which removes the need for direct connections between data sources; those direct connections can be very dangerous. No one loses track of who has access to what data, and data owners know who is looking at their data and who is using it.
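The broker pattern described here can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular integration server works: the `DataBroker` class, its method names, the field-level policy, and the sample badge data are all hypothetical.

```python
from datetime import datetime, timezone


class DataBroker:
    """Hypothetical intermediary: sources register once, users go through it."""

    def __init__(self):
        self._sources = {}   # source name -> callable returning rows
        self._policies = {}  # (user, source) -> set of fields the user may see
        self.audit_log = []  # every request, whether granted or denied

    def register_source(self, name, fetch):
        self._sources[name] = fetch

    def grant(self, user, source, fields):
        self._policies[(user, source)] = set(fields)

    def request(self, user, source):
        allowed = self._policies.get((user, source))
        granted = allowed is not None and source in self._sources
        self.audit_log.append({
            "user": user,
            "source": source,
            "granted": granted,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        if not granted:
            raise PermissionError(f"{user} may not read {source}")
        # Return only the fields the policy allows, never the raw rows.
        return [
            {key: value for key, value in row.items() if key in allowed}
            for row in self._sources[source]()
        ]


broker = DataBroker()
broker.register_source(
    "badge", lambda: [{"employee_id": "e1", "door": "lobby", "pin": "1234"}]
)
broker.grant("analyst", "badge", ["employee_id", "door"])

visible = broker.request("analyst", "badge")
print(visible)

try:
    broker.request("intern", "badge")
    denied = False
except PermissionError:
    denied = True
print(denied, len(broker.audit_log))
```

The analyst never holds an account on the badge system, the sensitive `pin` field never leaves the broker, and the audit log records both the granted and the denied request, which mirrors the point that data owners know who is looking at their data.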
It is vitally important to establish processes alongside centralized integration that provide standards for requesting, handling, and monitoring data access and usage. Having these processes in place and having a system that automates those processes helps you manage your data at a technical level, which in turn allows an agency to more confidently expose and share data both internally and with other agencies.
Interested in learning more? Visit http://www.softwareaggov.com/fast-big-data and dive into how your organization can leverage in-memory data management, analytics and decision-making tools, and real-time visualization.