The new index is produced by Dataguise and designed to follow enterprise trends in Hadoop distributions and data. The technology requirements for Big Data go beyond the ability of traditional applications to capture, manage, and secure data effectively, necessitating high-volume data processing of diverse information sets. In addition to adoption and deployment trends, the index also tracks challenges faced by organizations throughout the deployment process.
The BDBI Index will be updated regularly and is based on anecdotal survey data from IT professionals at medium and large enterprises engaged in the selection, deployment and/or oversight of Hadoop-based analytics. In the latest survey, 35 individuals were sampled at the recent Hadoop Summit held on June 26th and 27th in San Jose. The two-day event attracted a growing number of Hadoop end-users as well as industry thought leaders who showcased successful Hadoop use cases and shared development and administration tips and tricks. There was also a strong focus on educating organizations about how best to leverage Hadoop as a key component within enterprise data architecture.
The following is a select set of findings from the June BDBI Index:
• Percentage of respondents currently testing Hadoop in their organization: June/2013: 55%, April/2013: 43%
• What divisions within their company use Hadoop? Sales: June/2013: 39%, April/2013: 23%; Marketing: June/2013: 21%, April/2013: 28%; Customer support: June/2013: 24%, April/2013: 23%
• What are the major challenges faced in Hadoop implementations? Lack of skills: June/2013: 31%, April/2013: 35%; Hadoop usability: June/2013: 22%, April/2013: 23%; Security management: June/2013: 24%, April/2013: 21%; Data ingestion: June/2013: 24%, April/2013: 21%
• Does your organization store sensitive data (SSNs, credit card numbers, addresses, etc.) in Hadoop? Yes: June/2013: 37%, April/2013: 33%
• How important is it to know whether sensitive data is stored in Hadoop? Important or extremely important: June/2013: 87%, April/2013: 80%
• How important is it to protect access to sensitive data in Hadoop? Important or extremely important: June/2013: 97%, April/2013: 77%
• What security technologies does the organization use or intend to use to protect Hadoop data? Access control: June/2013: 27%, April/2013: 31%; Data encryption: June/2013: 27%, April/2013: 29%; Real-time monitoring: June/2013: 24%, April/2013: 18%; Data masking: June/2013: 20%, April/2013: 18%
Highlights of the June Index reveal a sharp increase in the share of respondents whose organizations are testing Hadoop, from 43% in April to 55% in June. Technologies under evaluation in these environments, according to the survey, include Cloudera, Hortonworks, MapR and Apache Hadoop. Data types being stored in Hadoop include structured DBMS data (36%) and log files (55%). In addition, 24% of the respondents are testing or using Hadoop with both structured and unstructured data.
Details on the importance of data security were also provided. In the June survey, 87% of respondents said it was important or extremely important to know whether sensitive data was stored in their Hadoop environment, and 97% said it was important or extremely important to protect access to sensitive data stored in Hadoop. The technologies that IT professionals use or intend to use to protect sensitive data include access control solutions, data encryption, real-time monitoring and data masking. All of these options are in roughly equal use among the survey participants and can typically be tied to a specific use case.
“The interest in Hadoop continues to rise, with the number of organizations testing the platform growing at a significant rate,” said Manmeet Singh, CEO, Dataguise. “This is being driven in part by the fact that traditional databases are not appropriate to handle the overwhelming volumes of data used for analytics. This increasing trend in data growth is moving companies to deploy Hadoop to store, manage and cull through the data. However, this has created other challenges such as the risk of exposing sensitive data stored in these environments.”
Dataguise (dataguise.com) is the leading provider of data privacy protection and compliance intelligence for sensitive data assets stored in both Big Data and traditional repositories. Dataguise’s comprehensive and centrally managed solutions allow companies to maintain a 360-degree view of their sensitive data, evaluate their compliance exposure risks, and enforce the most appropriate remediation policies, whether the data is stored on premises or in the cloud.