Big Data for Dummies
Big data management is one of the major challenges facing business, industry, and not-for-profit organizations. Data sets such as customer transactions for a mega-retailer, weather patterns monitored by meteorologists, or social network activity can quickly outpace the capacity of traditional data m...
Main Author: | |
---|---|
Other Authors: | |
Format: | Printed Book |
Published: | New Delhi, Wiley India, 2013 |
Series: | --For dummies. |
Subjects: | |
Table of Contents:
- pt. I. Getting started with big data
- 1. Grasping the fundamentals of big data
- Evolution of data management
- Understanding the waves of managing data
- Creating manageable data structures
- Web and content management
- Managing big data
- Defining big data
- Building a successful big data management architecture
- Capture, organize, integrate, analyze and act
- Architectural foundation
- Performance matters
- Traditional and advanced analytics
- 2. Examining big data types
- Defining structured data
- Exploring sources of big structured data
- Understanding the role of relational databases in big data
- Defining unstructured data
- Exploring sources of unstructured data
- Understanding the role of a CMS in big data management
- Looking at real-time and non-real-time requirements
- Putting big data together
- Managing different data types
- Integrating data types into a big data environment
- 3. Old meets new: distributed computing
- Brief history of distributed computing
- Giving thanks to DARPA
- The value of a consistent model
- Understanding the basics of distributed computing
- Why we need distributed computing for big data
- The changing economics of computing
- The problem with latency
- Demand meets solutions
- Getting performance right
- pt. II. Technology foundations for big data
- 4. Digging into the big data technology components
- Exploring the big data stack
- Redundant physical infrastructure
- Physical redundant networks
- Managing hardware : storage and servers
- Infrastructure operations
- Security infrastructure
- Interfaces and feeds to and from applications and the internet
- Operational databases
- Organizing data services and tools
- Analytical data warehouses
- Big data analytics
- Big data applications
- 5. Virtualization and how it supports distributed computing
- Understanding the basics of virtualization
- The importance of virtualization to big data
- Server virtualization
- Application virtualization
- Network virtualization
- Processor and memory virtualization
- Data and storage virtualization
- Managing virtualization with the Hypervisor
- Abstraction and virtualization
- Implementing virtualization to work with big data
- 6. Examining the cloud and big data
- Defining the cloud in the context of big data
- Understanding cloud deployment and delivery models
- The cloud as an imperative for big data
- Making use of the cloud for big data
- Providers in the big data cloud market
- Amazon's public Elastic Compute Cloud
- Google big data services
- Microsoft Azure
- OpenStack
- Where to be careful when using cloud services
- pt. III. Big data management
- 7. Operational databases
- RDBMSs are important in a big data environment
- PostgreSQL relational database
- Nonrelational databases
- Key-value pair databases
- Riak key-value database
- Document databases
- MongoDB
- CouchDB
- Columnar databases
- HBase columnar database
- Graph databases
- Neo4J graph database
- Spatial databases
- PostGIS/OpenGEO Suite
- Polyglot persistence
- 8. MapReduce fundamentals
- Tracing the origins of MapReduce
- Understanding the map function
- Adding the reduce function
- Putting map and reduce together
- Optimizing MapReduce tasks
- Hardware/network topology
- Synchronization
- File system
- 9. Exploring the world of Hadoop
- Explaining Hadoop
- Understanding the Hadoop Distributed File System (HDFS)
- NameNodes
- Data nodes
- Under the covers of HDFS
- Hadoop MapReduce
- 10. The Hadoop foundation and ecosystem
- Building a big data foundation with the Hadoop ecosystem
- Managing resources and applications with Hadoop YARN
- Storing big data with HBase
- Mining big data with Hive
- Interacting with the Hadoop ecosystem
- Pig and Pig Latin
- Sqoop
- Zookeeper
- 11. Appliances and big data warehouses
- Integrating big data with the traditional data warehouse
- Optimizing the data warehouse
- Differentiating big data structures from data warehouse data
- Examining a hybrid process case study
- Big data analysis and the data warehouse
- The integration lynchpin
- Rethinking extraction, transformation, and loading
- Changing the role of the data warehouse
- Changing deployment models in the big data era
- The appliance model
- The cloud model
- Examining the future of data warehouses
- pt. IV. Analytics and big data
- 12. Defining big data analytics
- Using big data to get results
- Basic analytics
- Advanced analytics
- Operationalized analytics
- Monetizing analytics
- Modifying business intelligence products to handle big data
- Data
- Analytical algorithms
- Infrastructure support
- Studying big data analytics examples
- Orbitz
- Nokia
- NASA
- Big data analytics solutions
- 13. Understanding text analytics and big data
- Exploring unstructured data
- Understanding text analytics
- Difference between text analytics and search
- Analysis and extraction techniques
- Understanding the extracted information
- Taxonomies
- Putting your results together with structured data
- Putting big data to use
- Voice of the customer
- Social media analytics
- Text analytics tools for big data
- Attensity
- Clarabridge
- IBM
- OpenText
- SAS
- 14. Customized approaches for analysis of big data
- pt. V. Big data implementation
- 15. Integrating data sources
- Identifying the data you need
- Exploratory stage
- Codifying stage
- Integration and incorporation stage
- Understanding the fundamentals of big data integration
- Defining traditional ETL
- Data transformation
- Understanding ELT : extract, load, and transform
- Prioritizing big data quality
- Using Hadoop as ETL
- Best practices for data integration in a big data world
- 16. Dealing with real-time data streams and complex event processing
- Explaining streaming data and complex event processing
- Using streaming data
- Data streaming
- The need for metadata in streams
- Using complex event processing
- Differentiating CEP from streams
- Understanding the impact of streaming data and CEP on business
- 17. Operationalizing big data
- Making big data a part of your operational process
- Integrating big data
- Incorporating big data into the diagnosis of diseases
- Understanding big data workflows
- Workload in context to the business problem
- Ensuring the validity, veracity, and volatility of big data
- 18. Applying big data within your organization
- Figuring the economics of big data
- Identification of data types and sources
- Business process modifications or new process creation
- The technology impact of big data workflows
- Finding the talent to support big data projects
- Calculating the return on investment (ROI) from big data investments
- Enterprise data management and big data
- Defining enterprise data management
- Creating a big data implementation road map
- Understanding business urgency
- Projecting the right amount of capacity
- Selecting the right software development methodology
- Balancing budgets and skill sets
- Determining your appetite for risk
- Starting your big data road map
- 19. Security and governance for big data environments
- Security in context with big data
- Assessing the risk for the business
- Risks lurking inside big data
- Understanding data protection options
- The data governance challenge
- Auditing your big data process
- Identifying the key stakeholders
- Putting the right organizational structure in place
- Preparing for stewardship and management of risk
- Setting the right governance and quality policies
- Developing a well-governed and secure big data environment
- pt. VI. Big data solutions in the real world
- 20. The importance of big data to business
- Big data as business planning tool
- Planning with data
- Doing the analysis
- Checking the results
- Acting on the plan
- Adding new dimensions to the planning cycle
- Monitoring in real time
- Adjusting the impact
- Enabling experimentation
- Keeping data analytics in perspective
- Getting started with the right foundation
- Getting your big data strategy started
- Planning for big data
- Transforming business processes with big data
- 21. Analyzing data in motion : a real-world view
- Understanding companies' needs for data in motion
- The value of streaming data
- Streaming data with an environmental impact
- Using sensors to provide real-time information about rivers and oceans
- The benefits of real-time data
- Streaming data with a public policy impact
- Streaming data in the healthcare industry
- Capturing the data stream
- Streaming data in the energy industry
- Using streaming data to increase energy efficiency
- Using streaming data to advance the production of alternative sources of energy
- Connecting streaming data to historical and other real-time data sources
- 22. Improving business processes with big data analytics: a real-world view
- Understanding companies' needs for big data analytics
- Improving the customer experience with text analytics
- The business value to the big data analytics implementation
- Using big data analytics to determine next best action
- Preventing fraud with big data analytics
- The business benefit of integrating new sources of data
- pt. VII. The part of tens
- 23. Ten big data best practices
- 24. Ten great big data resources
- Hurwitz & Associates
- Standards organizations
- The Open Data Foundation
- The Cloud Security Alliance
- National Institute of Standards and Technology
- Apache Software Foundation
- OASIS
- Vendor sites
- Online collaborative sites
- Big data conferences
- 25. Ten big data do's and don'ts
- Glossary.