5 October 2016

Intelligent Data Center

The holy grail of the data center is complete automation and intelligent management of all services, infrastructure, storage, security, and data. To get there, however, one has to think outside the box of a standard system. Data centers run many complex, large-scale applications that are difficult to manage, and the infrastructure ultimately has to be managed at massive scale, especially for Big Data and the abstractions of the Cloud. Why do we need engineers in data centers when we can converge, automate, and build software that allows intelligent agents to do the work for us? How does one bring intelligence into an existing system or its sub-systems? Through machine learning and the representation of knowledge. The following sections look at ways of tackling the friction and complexity of the data center, and at moving towards intelligent data protection services.

Key areas identified for data center efficiency and management:
  • data center operation automation
  • characterization and synthesis of workload spikes
  • dynamic resource allocation
  • quick and accurate identification of recurring performance problems (a sketch follows this list)
  • optimisation of systems
  • energy resource optimisation
  • fault tolerance
  • operational readiness and maintenance
  • fundamental protection of data
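
Several of these areas map onto fairly standard machine learning building blocks. As a minimal sketch of the recurring-performance-problems item, the hypothetical example below clusters incident descriptions with TF-IDF and k-means from scikit-learn so that repeat issues group together; the incident strings, feature choices, and cluster count are illustrative assumptions rather than a prescribed implementation.

    # Hypothetical sketch: group similar incident reports so recurring
    # performance problems surface as dense clusters.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    incidents = [
        "high latency on storage array after nightly backup window",
        "backup window overruns causing storage latency spikes",
        "vm host cpu saturation during batch analytics job",
        "cpu saturation on analytics hosts at month-end batch run",
    ]

    # TF-IDF turns free-text incident reports into comparable vectors.
    vectors = TfidfVectorizer(stop_words="english").fit_transform(incidents)

    # Two clusters assumed here purely for illustration.
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

    for label, text in zip(labels, incidents):
        print(label, text)

Clusters that keep gaining members over time point to a recurring problem rather than a one-off incident.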

Knowledge representation is already available in databases, but it is not semantic enough for agents to understand. Semantics also help to categorise information and make it searchable. One immediate benefit is smart, custom categorisation alongside sensible defaults, and the merging of the two. This can be extended across all products and services, into a data protection ontology, and ultimately into a knowledge representation for the entire data center. Such representations can be applied programmatically so that agents can infer and reason over concepts and things.

Semantic ontologies also apply to entity management, search, and dynamic reasoning. Reasoning can then move beyond the constraints of simple rules to richer approaches such as probabilistic, commonsense, deductive, and inductive reasoning. Once semantics are in place, an agent can adopt a BDI (belief-desire-intention) approach and ultimately benefit from argumentation in communication via game theory. Smarter agents mean better data protection and more cohesive fault tolerance, recovery, and management of data centers.

Extending this into graph theory, one can build a complete map of everything happening in a data center, or in a set of geographically dispersed data centers for a customer, using connected linked data for knowledge discovery and complex network analysis. Recommendations can be derived with approaches such as LDA and matrix factorisation. The semantic knowledge then grows, consuming more data incrementally like a connected knowledge graph, through clustering and word embeddings such as word2vec or GloVe. Tagged data supports topic maps and contextual identification, and probabilistic reasoning can add further value to contextual scores.

Going further, deep learning can extend a semantic Bayesian network into a deep belief network that captures the complexities of data center infrastructure and harnesses existing resources to build a conceptual map of the world. Another option is a global optimisation strategy in the form of swarm intelligence, inspired by natural computation, for foraging over the infrastructure, detecting points of anomaly, and increasing fault tolerance. There are benefits here, too, in reducing energy consumption for cost-effective data center management.

This only touches the surface of what automation of the data center should mean: an instant advisor to engineers and managers via a mobile smart bot, in the spirit of Siri or AlphaGo, that can answer questions about the data center and provide deep insights, while at the same time connecting with the client to make their lives simpler and easier. Is an informational dashboard alone sufficient? Perhaps; but being able to interact with a virtual advisor through the dashboard adds to the conversation between the engineer and the objective of managing a data center effectively. Having lots of data is good, but it truly matters when that data can be made semantically meaningful for human consumption and when the burden of storage and infrastructure management can be reduced through intelligent means, without manual intervention. One might object that the limits of such insights lie to some degree in third-party software; the question, however, is whether those products can be enriched to provide even more contextualisation and intelligence.
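
As a minimal illustration of the knowledge-representation idea above, the following hypothetical sketch models a few data center assets as linked data with rdflib and lets an agent answer a simple question over them with SPARQL. The namespace, classes, and properties are invented for the example rather than taken from an existing data protection ontology.

    # Hypothetical sketch: a tiny semantic model of data center assets
    # that an agent can query, using rdflib.
    from rdflib import Graph, Namespace, Literal, RDF, RDFS

    DCN = Namespace("http://example.org/datacenter#")  # illustrative namespace
    g = Graph()
    g.bind("dcn", DCN)

    # A storage array hosting two backup sets (invented classes/properties).
    g.add((DCN.StorageArray, RDF.type, RDFS.Class))
    g.add((DCN.array1, RDF.type, DCN.StorageArray))
    g.add((DCN.array1, RDFS.label, Literal("Array 1, rack B4")))
    g.add((DCN.backupSetA, DCN.storedOn, DCN.array1))
    g.add((DCN.backupSetB, DCN.storedOn, DCN.array1))

    # An agent asking: which backup sets are affected if array1 fails?
    results = g.query(
        "SELECT ?backup WHERE { ?backup dcn:storedOn dcn:array1 . }",
        initNs={"dcn": DCN},
    )
    for row in results:
        print(row.backup)

Swap the toy triples for a real ontology and the same query pattern gives an agent a way to reason about blast radius, recovery order, or data protection coverage.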

Additionally, such approaches can be reused outside the data center context: for lead generation for sales representatives, and for sentiment analysis in marketing and PR, helping to find new customers in the process as well as to identify the features customers most want from the product. Features can be delivered as modules on a microservices platform, letting clients pick and choose a tailored option that meets their data center requirements. The goal is to provide not just actionable insights but an entire intelligent path towards automation of the whole data center from the viewpoint of data management and infrastructure.
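
As a small, hedged example of the sentiment analysis reuse mentioned above, the snippet below scores customer feedback with NLTK's VADER analyser; the feedback strings are made up, and any real pipeline would add source ingestion, language handling, and aggregation.

    # Hypothetical sketch: scoring customer feedback for marketing/PR use.
    import nltk
    from nltk.sentiment import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # one-off lexicon download
    sia = SentimentIntensityAnalyzer()

    feedback = [
        "The new backup module saved us hours every week.",
        "Restore times are still far too slow for our SLAs.",
    ]

    for text in feedback:
        # compound ranges from -1 (negative) to +1 (positive)
        print(sia.polarity_scores(text)["compound"], text)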

Architecture and Implementation Ideas:
  • Knowledge Graph: Cassandra/Titan/Elasticsearch
  • Deep Learning: DL4J or TensorFlow
  • Big Data: Spark/Flink/Hadoop, Kafka, and others
  • Semantic Linked Data: DBpedia, ConceptNet (analogical reasoning), WordNet, SKOS (thesaurus server), Event Calculus (commonsense reasoning), reasoners, semantic & faceted search
  • Probabilistic Reasoning: Figaro/Factorie
  • NLP: CoreNLP, UIMA, GATE, OpenNLP, Sphinx, DL4J/TensorFlow
  • Microservices: Restlet/Dropwizard, Distributed Tracing, Service Discovery, Anomaly Detection (see the sketch after this list), Anomaly Correlation, Centralised Logging, Reactive Programming, Circuit Breakers
  • Dashboard: D3, Bokeh, Seaborn, Gephi
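
To make the anomaly detection item above concrete, here is a hypothetical sketch that flags unusual host metrics with an isolation forest from scikit-learn. In the architecture above it would sit behind a microservice endpoint and feed anomaly correlation and the dashboard; the metrics, distributions, and contamination rate here are synthetic stand-ins.

    # Hypothetical sketch: flag anomalous host metrics (CPU %, IO wait %)
    # with an isolation forest; flagged points would be pushed to anomaly
    # correlation and the dashboard in the architecture above.
    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.RandomState(0)

    # Synthetic "normal" telemetry plus a few injected outliers.
    normal = rng.normal(loc=[40.0, 5.0], scale=[10.0, 2.0], size=(500, 2))
    outliers = np.array([[98.0, 45.0], [95.0, 60.0], [5.0, 70.0]])
    metrics = np.vstack([normal, outliers])

    model = IsolationForest(contamination=0.01, random_state=0).fit(metrics)
    flags = model.predict(metrics)  # -1 marks an anomaly, 1 marks normal

    for point in metrics[flags == -1]:
        print("anomalous sample:", point)

The same pattern generalises to energy or capacity metrics, and the flagged samples are exactly the kind of event a virtual advisor could surface to an engineer.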