31 December 2019
Why DataRobot Sucks
- Limited machine learning algorithms and limited ways to optimize them for use cases
- Limited flexibility of supporting data features
- Limited sub-selection of model results owing to the limited choice of models
- Ugly visualization for performance and comparisons
- The blender option for ensemble methods and choices is limited
- Too limited to do anything complex
- Doesn't replace the value of handcrafted models or of the feature engineering process
- Why automate feature engineering, especially when that process is what allows one to build a generalizable model and a better understanding of the business case?
- Limited ways to evaluate the choice of model
- Limited ways to import/export models for build/deployment
- Tight coupling to a third-party way of doing things
- Doesn't follow the formal data science method
- In fact, doesn't even replace 99% of the work
- GDPR and governance issues when processing data through the third-party models
- Benchmarks on models are not available for public peer review
- Cost far exceeds the benefit
- Many of the models provided have less than optimal outcomes - low confidence scores
- Not good for complicated analysis, best to keep use cases as simple as possible
- Productivity gain is only with very limited and simple cases
- Invariably, most business cases have noisy data and require custom models, where the tool becomes unworkable and useless against complexity and ambiguity
- One needs to learn another third-party tool and be willing to trust the model solution blindly
- Quick wins and successes are neither the answer nor the solution if they do not solve the business case
- The reality is there is no free or easy lunch for most business cases and one has to put in the time and effort
- No need to cheat one's way through the process
- Unconvincing autopilot feature - there is no such thing as autopilot in machine learning; in fact, there are no real community standards or even defined patterns, so how can one jump decades ahead in progress?
- The solution is useful to people who can't code, don't like to code, don't understand the data science method, and prefer simple drag-and-drop options
- Data input is treated in most cases as a table, not workable for noisy unstructured data
- Often ends up with overfitted models, when the whole point of machine learning is to build generalizable models
- No flexible options for transfer learning on unseen data
- And, that isn't even the end of it...
30 December 2019
27 December 2019
21 December 2019
Recruitment Agencies
There are some glaring practices in the recruitment industry. Some agencies actively practice reverse discrimination by only hiring women for roles such as nurse, personal assistant, receptionist, secretary, catwalk model, and others. However, if the tables were turned and an agency were to only hire men, there would likely be an uproar. This could equally be construed as sexism on the part of the organizations that these recruitment agencies work with to fill such roles. In a society where there are now apparently 100+ genders, how does one even know whether a person is a man or a woman anymore? Would a transwoman/transsexual be rejected by such an agency for employment? And should one even care what gender they are? Institutional racism is also a major issue, where the recruiter tends to pick candidates based on their inherent bias. Often the practice continues under the covers in some companies in the form of "cultural fit" as an umbrella term. In fact, it doesn't stop at gender and race. In many cases, all forms of hypocrisy emerge that make recruitment a very humanly flawed discipline. Also, how can a recruiter profile a candidate purely on the basis of a phone and email conversation? And it is practically impossible for a recruiter to be fully aware of all the skills listed on a person's resume. Such practices have a very dampening effect on the economic demand and supply for jobs and candidates in the market. AI certainly is the way to go to avoid such recruitment risks by refocusing on the things that really matter in employment - fairness around qualifications, practical experience, and the skills to do one's job - rather than passing judgement on a likability factor. However, an important aspect here is that AI should not only be approached via probabilistic methods. Just imagine what would happen if one were to probabilistically cluster bias based on someone's name in order to identify their ethnicity.
Can you really tell someone's background based on their name alone? What if they are of mixed background? Won't they be an outlier? AI combines both logical reasoning as well as probabilistic methods.
10 December 2019
Reality of DevOps
There are some typical attitudes that emerge among devops engineers in many organizations. Partly it is a result of dealing with a lot of clueless employees. And partly it is because they seem to think they are the most important group in the whole organization, so they develop an air of superiority while giving other business groups the impression that everything is way too complicated for people to understand. Let's face it: devops is not exactly rocket science. It is a merger of two interdependent functions - development and operations. The following are some typical examples of what happens when people in the workplace interact with devops.
- They make changes without informing others, and without providing sufficient advance notification of a planned outage
- They will tell you something is too complicated, when it's a simple case of drag/drop or window click
- They will exaggerate the time it will take them to get something done
- They will rarely get it done within the agreed timescales, so expect unforeseen/unplanned delays
- They will expect you to log a ticket for everything, even if logging the ticket takes more time than doing the work
- They will reject everything you say, then they will say exactly what you had said as if to make out what they said was quite deep
- They will use methods and tools only they can understand and with little to no shared documentation
- They will reject the way you had done something, then approach it exactly the same way
- They will look down on you for everything; even while sitting in a meeting, explaining anything is a complete chore for them, yet they will expect you to have the highest level of communication skills and patience with them
- When you explain something simple to them, expect to explain it in French, English, Swahili, and plenty of other languages, and still expect them not to understand any of it, even while everyone else in the room likely understood it in full
- Sometimes your login accesses might be randomly revoked and then randomly start working again; this isn't magic - the devops are likely doing their usual unplanned maintenance or config changes
- When you can't ssh into a server because the devops have randomly destroyed your instance during their regular cleanup sessions, and you have to start all over again with your work
- When they never acknowledge their mistakes that lead to critical outage issues in production
- When the wrong model or code is deployed into production
- Using too many rigid processes and creating barriers between themselves and the user
- Using oversimplifying metaphors
- Not understanding the value of automation
- Poor methods for testing what they deliver to business
- Lack of a formal architecture evaluation
- Badly managed incident management
- Lack of discipline for conducting effective, sensible, and responsive postmortem evaluations
- Misuse of metrics that don't display the full picture and end up costing businesses
- They will tell you it will take at least a month to spin up and set up instances in the cloud, when it should probably take them anywhere from 2 minutes to maybe 2 days if they need to cluster and set up security groups
- They will block you from doing anything yourself, or even to assist them to speed things up
- Each member of a devops team works in a silo of their own ego bubble
- Expect them never to be available when you have an outage situation so they can feel an air of importance
Labels: artificial intelligence, big data, Cloud, data science, deep learning, devops, machine learning, microservices
9 December 2019
4 December 2019
19 November 2019
KBPedia
Labels: big data, data science, linked data, natural language processing, ontology, semantic web, text analytics
9 November 2019
7 November 2019
6 November 2019
LIDA
Models of Consciousness (Scholarpedia)
Models of Consciousness (Wikipedia)
Neuroscience
Neuroscience Online
Harvard Neuroscience Online
5 November 2019
What is AI
What is AI? | Thinking          | Acting
Humanly     | Cognitive Science | Turing Test, Behaviorism
Rationally  | Laws of Thought   | Doing The Right Thing
3 November 2019
2 November 2019
Ladder Ontologies
- Asocial Ontologies
- Social Ontologies
- Cultural Ontologies
- Oral Linguistic Ontologies
- Literate Ontologies
- Civilization-scale Ontologies
Labels: big data, data science, deep learning, linked data, machine learning, natural language processing, semantic web, text analytics
1 November 2019
Java Demise
The speed with which new versions are being released spells the end of Java in the practical business world in the foreseeable future. There are two releases each year (every 6 months), which is significant. The biggest hurdle for businesses is maintenance and resources. Many products are still dependent on Java 8, while commercial licenses have been required for upgrades since 2019. The other hurdle is technical debt and backwards compatibility constraints, especially when a product is implemented in Java and then sold to customers. In a very short span of time there have been quite a few changes to the language and an ample set of versions. One can say that the Java release cycle has exploded in speed to the point that the majority of the community, for all practical intents and purposes, will not be able to keep up. What this also means is that the ecosystem of tools and libraries takes a while to upgrade, making management a frustration for the engineering and support teams. The Java ecosystem is huge, and the fallback mechanisms with lots of boilerplate code, the formal testing processes compensating for the lack of design patterns baked into the language, and dependency hell are massive hurdles. It seems that gradually more and more organizations will distance themselves from Java in order to keep maintenance costs down, meet customer expectations and demand for new product features, and reduce complexity, especially in mobile and cloud environments. Another likely reason is Oracle's ownership of the language and the expectations set by its end-user license. Unfortunately, there is a love-hate relationship with the language in the community.
Even if interest in the language were to decline in the community, it would still lurk under the covers and rear its ugly head as a dependency for other languages like Groovy and Kotlin, and for several open source Microservices and Big Data platforms.
Labels: big data, Cloud, data science, distributed systems, Java, mobile, programming, software engineering
XLNet
Labels: big data, data science, deep learning, machine learning, natural language processing, text analytics
31 October 2019
30 October 2019
Office Search
office search
hubblehq
complete office search
prime office search
workspaces
flexioffices
free office finder
office genie
coworker
croissant
included.co
instant offices
pickspace
share desk
happy desk
desksnear.me
desktime
coworking
desk pass
copass
coworking wiki
desk surfing
near desk
liquid space
gocowo
find workspaces
sneed
ofixu
qdesq
share your office
spacelist
42 floors
office list
coffices
breather
peerspace
bizly
spacewhiz
beewake
awfis
all office centers
splacer
lexc
desk camping
worksnug
seats 2 meet
coworking.coffee
commercial cafe
preferred office network
heydesk
office freedom
flexas
kontor
labs
spacesworks
thebrew
cowork
28 October 2019
Curse of Dimensionality
As you increase the number of input features, the number of input combinations can grow exponentially. As the combinations grow, each training sample covers a smaller percentage of the possibilities. The result is that as you add features, you need to increase the size of your training set, possibly exponentially. As the number of dimensions goes up, the model must train on significantly more data in order to learn an accurate representation of the input space.
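A toy sketch of this effect: if each feature is discretized into k bins, the input space has k to the power d distinct cells, so a fixed training set covers an exponentially shrinking fraction of it (the bin counts and sample sizes below are made up for illustration).

```python
def coverage(n_samples, bins_per_feature, n_features):
    # Number of distinct input "cells" grows exponentially with features.
    cells = bins_per_feature ** n_features
    # Best case: every sample lands in a distinct cell.
    return min(n_samples, cells) / cells

print(coverage(1000, 10, 2))  # 10**2 = 100 cells -> fully covered
print(coverage(1000, 10, 6))  # 10**6 cells -> at most 0.1% covered
```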
Labels: big data, data science, deep learning, machine learning, natural language processing, text analytics
26 October 2019
Annotation Services
Figure Eight
Hypothes.is
Open Annotations
Annotatorjs
OpenAnnotate
Prodi.gy
Doccano
Brat
Tagtog
X-Lisa
LightTag
DataTurks
Supervise.ly
AnnotatedStar
Folia
Annotable
Diigo
Zap
Lionbridge
Gate
UIMA
RSTWeb
LabelIMG
VGG Image Annotator
LabelBox
LabelMe
ImageTagger
RectLabel
Diffgram
Fast Annotation Tool
Further Annotation Tools
Labels: big data, data science, deep learning, machine learning, natural language processing, text analytics
25 October 2019
24 October 2019
23 October 2019
21 October 2019
Description Logics
description logic
description logic primer
description logics
complexity of reasoning
foundations of description logics
ontologies and the semantic web
description logics
list of reasoners
Labels: artificial intelligence, big data, data science, intelligent web, linked data, semantic web
Alternative Sequences
- Attention - memory added to other networks to guide focus
- Transformers - networks that use attention exclusively instead of recurrent and convolutional layers
- Temporal Convolutional Networks - CNN designed for sequences
Attention Is All You Need
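The core operation behind transformers can be sketched in a few lines: scaled dot-product attention, as introduced in "Attention Is All You Need". This is a minimal NumPy illustration with random toy matrices, not a production implementation.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Score each query against each key, scaled by sqrt(d_k) for stability.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over keys turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Output is a weighted sum of the value vectors.
    return weights @ V

# Toy example: 3 sequence positions, dimension 4.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)
```

Unlike recurrent layers, every position attends to every other position in one step, which is what lets transformers replace recurrence entirely.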
17 October 2019
16 October 2019
ALCOMO
Labels: big data, data science, intelligent web, linked data, natural language processing, semantic web
AnyBURL
Labels: big data, data science, intelligent web, linked data, natural language processing, semantic web
PatyBred
Labels: big data, data science, ecommerce, intelligent web, linked data, natural language processing, semantic web
WebisLOD
Labels: big data, data science, ecommerce, intelligent web, linked data, natural language processing, semantic web
DBkWik
Labels: big data, data science, ecommerce, intelligent web, linked data, natural language processing, semantic web
T2D
Labels: big data, data science, ecommerce, intelligent web, linked data, natural language processing, semantic web
Winter
Labels: big data, data science, ecommerce, intelligent web, linked data, natural language processing, semantic web
15 October 2019
Enthymemes and Argument Mining
Finding Enthymemes in Real-World Texts
Argument Mining Using Argumentation Scheme Structures
Argumentative Approaches to Reasoning with Maximal Consistency
Dave the Debater
Argument Mining Papers (Filtered)
Workshop Synthesis:
EMNLP 2019
EMNLP 2018
EMNLP 2017
EMNLP 2016
EMNLP 2015
EMNLP 2014
Tutorials:
Computational Argumentation
Argument Mining
Unsupervised Corpus Wide Claim Detection
Argument Mining
Argumentation Mining (Synthesis Series)
Labels: big data, data science, deep learning, linked data, machine learning, natural language processing, semantic web, text analytics
14 October 2019
13 October 2019
12 October 2019
Prime Spirals
Labels: artificial intelligence, big data, data science, deep learning, machine learning, security
BibFrame
Labels: big data, data science, deep learning, library, linked data, machine learning, natural language processing, semantic web, text analytics
Netron
Labels: big data, data science, deep learning, JavaScript, machine learning, python, visualization
11 October 2019
9 October 2019
8 October 2019
6 October 2019
KBert
Labels: big data, data science, deep learning, linked data, machine learning, natural language processing, semantic web, text analytics
5 October 2019
2 October 2019
1 October 2019
Marketing Mix
- Packaging
- Partnership
- Passion
- Penetration
- People
- Perception
- Personality
- Persuasion
- Phrases
- Physical
- Place
- Placement
- Planning
- Popularity
- Population
- Positioning
- Positiveness
- Power
- Pragmatism
- Preference
- Price
- Privacy
- Process
- Product
- Productivity
- Professionalism
- Profit
- Promotion
- Prospect
- Publicity
- Purchase
- Push-Pull
- Picture
- Part
- Pilot
- Persona
- Peers
- Pass-Along-Value
- Party
- Pandemic
- Pandemonium
- Pain
- Placebo
- Planting
- Playfulness
- Pleasure
- Plot
- Politics
- Praise
- Prediction
- Premeditation
- Press
- Pressure
- Preview
- Principle
- Prominence
- Promise
- Proof
- Properties
- Prosperous
- Protection
- Purple Cow
- Purpose
- Production
Medical Codes
ICD-10 - Diagnosis
CPT - Procedures
LOINC - Laboratory
RxNorm - Medications
ICF - Disabilities
CDT - Dentistry Procedures
DSM-IV-TR - Psychiatric Illnesses
NDC - Drugs
DRG - Diagnosis Group
HCPC - Procedures
Survey of Embeddings Use Cases for Clinical Healthcare
Labels: big data, data science, deep learning, machine learning, natural language processing, text analytics
30 September 2019
Programming Paradigms
There are various programming paradigms that have come about in computer science. However, none has replicated the abstractions of philosophic logic in their entirety to leverage the full capacity for artificial intelligence. Building programs that replicate philosophic constructs might be a way of leveraging better abstractions for artificial intelligence programs as well as reasoning systems. The following are some constructs, spanning abstraction and concreteness, that could be baked in or extended as part of the evolution of programming languages to converge the mind and the mechanisms of machines:
- Class
- Object
- Concept
- Thing
- Predicate
- Actor
- Agent
- Subject
- Standard
- Data
- Thought
- Type
- Being
- Action
- Intent
- Event
- Belief
- Desire
- Message
- Axiom
- Restrict
- Rule
- Function
- Relation
- Attribute
- Instance
- Policy
- Critic
- Reward
- Agency
- Context
- Domain
- Range
- Observe
- Process
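A few of the constructs above (Agent, Belief, Desire, Intent) can already be approximated with today's object-oriented features, in the spirit of BDI-style agents; baking them into a language would make them first-class rather than library conventions. This is a purely hypothetical sketch and all names in it are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Belief:
    statement: str

@dataclass
class Desire:
    goal: str

@dataclass
class Agent:
    beliefs: list = field(default_factory=list)
    desires: list = field(default_factory=list)

    def deliberate(self):
        # Form an intent: the first desire not contradicted by a belief.
        for d in self.desires:
            if not any(b.statement == f"not {d.goal}" for b in self.beliefs):
                return d.goal
        return None

a = Agent(beliefs=[Belief("not fly")],
          desires=[Desire("fly"), Desire("walk")])
print(a.deliberate())  # walk
```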
Labels: big data, computer science, data science, linked data, machine learning, natural language processing, programming, semantic web, text analytics
28 September 2019
27 September 2019
24 September 2019
CoreDNS
Labels: big data, data science, deep learning, internet, machine learning, natural language processing, semantic web
19 September 2019
18 September 2019
Classification (Binary, Multi-Class, Multi-Label)
Binary Classifier - chooses a single category for an object from two categories
Multi-Class Classifier - chooses a single category for an object from multiple categories
Multi-Label Classifier - chooses as many categories as applicable for the same object
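The distinction can be sketched with made-up per-category scores (this is a minimal illustration, not any particular library's API): the same scores drive a multi-class decision via argmax, a multi-label decision via a threshold, and binary is just the two-category special case.

```python
# Hypothetical per-category scores for one object (illustrative only).
scores = {"sports": 0.7, "politics": 0.6, "tech": 0.1}

# Multi-class: pick exactly one category (the argmax).
multi_class = max(scores, key=scores.get)

# Multi-label: pick every category scoring above a threshold.
multi_label = {c for c, s in scores.items() if s >= 0.5}

# Binary: a two-category special case (e.g. relevant vs irrelevant).
binary = "relevant" if max(scores.values()) >= 0.5 else "irrelevant"

print(multi_class)  # sports
print(multi_label)  # {'sports', 'politics'}
print(binary)       # relevant
```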
17 September 2019
16 September 2019
13 September 2019
10 September 2019
7 September 2019
2 September 2019
Ballerina
Labels: big data, Cloud, data science, devops, distributed systems, programming, software engineering
1 September 2019
30 August 2019
28 August 2019
26 August 2019
21 August 2019
19 August 2019
Applications of Data Science
- Anomaly Detection
- Assistive Services
- Auto-Insurance Risk Prediction
- Automated Closed Captioning
- Automated Image Captioning
- Automated Investing
- Autonomous Ships
- Brain Mapping
- Caller Identification
- Cancer Diagnosis/Treatment
- Carbon Emissions Reduction
- Classifying Handwriting
- Computer Vision
- Credit Scoring
- Crime: Predicting Locations
- Crime: Predicting Recidivism
- Crime: Predicting Policing
- Crime: Prevention
- CRISPR Gene Editing
- Crop-Yield Improvement
- Customer Churn
- Customer Experience
- Customer Retention
- Customer Satisfaction
- Customer Service
- Customer Service Agents
- Customized Diets
- Cybersecurity
- Data Mining
- Data Visualization
- Detecting New Viruses
- Diagnosing Breast Cancer
- Diagnosing Heart Disease
- Diagnostic Medicine
- Disaster-Victim Identification
- Drones
- Dynamic Driving Routes
- Dynamic Pricing
- Electronic Health Records
- Emotion Detection
- Energy-Consumption Reduction
- Facial Recognition
- Fitness Tracking
- Fraud Detection
- Game Playing
- Genomics and Healthcare
- Geographic Information Systems
- GPS Systems
- Health Outcome Improvement
- Hospital Readmission Reduction
- Human Genome Sequencing
- Identity-Theft Prevention
- Immunotherapy
- Insurance Pricing
- Intelligent Assistants
- Internet of Things and Medical Device Monitoring
- Internet of Things and Weather Forecasting
- Inventory Control
- Language Translation
- Location-Based Services
- Loyalty Programs
- Malware Detection
- Mapping
- Marketing
- Marketing Analytics
- Music Generation
- Natural Language Translation
- New Pharmaceuticals
- Opioid Abuse Prevention
- Personal Assistants
- Personalized Medicine
- Personalized Shopping
- Phishing Elimination
- Pollution Reduction
- Precision Medicine
- Predicting Cancer Survival
- Predicting Disease Outbreaks
- Predicting Health Outcomes
- Predicting Student Enrollments
- Predicting Weather-Sensitive Product Sales
- Predictive Analytics
- Preventative Medicine
- Preventing Disease Outbreaks
- Reading Sign Language
- Real-Estate Valuation
- Recommendation Systems
- Reducing Overbooking
- Ride Sharing
- Risk Minimization
- Robo Financial Advisors
- Security Enhancements
- Self-Driving Cars
- Sentiment Analysis
- Sharing Economy
- Similarity Detection
- Smart Cities
- Smart Homes
- Smart Meters
- Smart Thermostats
- Smart Traffic Control
- Social Analytics
- Social Graph Analytics
- Spam Detection
- Spatial Data Analysis
- Sports Recruiting and Coaching
- Stock Market Forecasting
- Student Performance Assessment
- Summarizing Text
- Telemedicine
- Terrorist Attack Prevention
- Theft Prevention
- Travel Recommendations
- Trend Spotting
- Visual Product Search
- Voice Recognition
- Voice Search
- Weather Forecasting
Dweet.io
Labels: big data, data science, deep learning, distributed systems, event-driven, machine learning, webservices
PubNub
Labels: big data, data science, deep learning, distributed systems, event-driven, machine learning, webservices
18 August 2019
Types of Data Discovery
- CDR
- Emails
- ERP
- Social Media
- Web Logs
- Server Logs
- System Logs
- HTML Pages
- Sales
- Photos
- Videos
- Audios
- Tabulated
- CRM
- Transactions
- XDR
- Sensor Data
- Call Center
- Knowledge Bases
- Google Search
- Google Trends
- News
- Sanctions Data
- Profile Data
Labels: big data, data science, deep learning, intelligent web, machine learning, natural language processing, semantic web, text analytics, webcrawler, webscraper
17 August 2019
15 August 2019
Wake Word Voice
Yet Another Wake Word Engine
Wake Up Word Speech Recognition
Choosing A Wake Word
Using Wake-Up Word To Filter Out Background Speech
PocketSphinx
Porcupine
Mycroft-Precise
Help Us Improve Precise
Snips.Ai
Federated Learning for Wake Word
Customize Your Voice Assistant With Personal Wake-Word
Alexa Wake-Word Techniques
Visual Wake-Word Dataset
Rhasspy
Snowboy
Revisiting Wake-Word Accuracy and Privacy - Sensory
ExpressIf
How to do Real-Time Trigger Word Detection
On Convolutional LSTM for Joint Wake-Word Detection
Matrix Wake Word Sphinx
GassistPi
Direct Modelling of Raw Audio for Wake-Word Detection
Without Wake-Word
Amazon Alexa
Offline Voice Recognition
Sequence Models for Trigger Words
Arxiv Wake Word
Donut CTC Query-By-Example - Keyword Spotting
How To Easily Command Your App With Hotword Detection
Hotword Cleaner
Challenges To Open Voice Interfaces
DSP Illustrated
Houndify Wake Word
Wake Word Benchmark
Alexa Dataset Wake Word
Detecting Wake Words In Speech
Alexa Wake Words
Custom Alexa Wake Word Generation Dataset
Trigger Word Detection
8 August 2019
Quantum AI for Psychic Abilities
The 3 T's - teleportation, telepathy, and telekinesis - are already an active research area. However, the following psychic abilities could also come into the mix in AI:
- Thoughtography - imprinting images in one's mind onto physical surfaces
- Scrying - able to look into mediums to view and detect suitable information
- Second Sight - able to see future and past events or perceive information (precognition)
- Retrocognition - supernaturally perceive past events (postcognition)
- Remote Viewing - able to see distant or unseen target with extrasensory perception
- Pyrokinesis - able to manipulate fire through mind
- Psychometry - able to get information about a person or object by touch
- Psychic Surgery - able to remove disease or disorder within or over the body with energetic incision to heal the body
- Prophecy - able to predict the future
- Precognition - able to perceive future events
- Mediumship - able to communicate with the spirit world
- Levitation - able to float or fly by psychic means
- Energy Medicine - able to heal one's own empathic etheric, astral, mental, or spiritual energy
- Energy Manipulation - able to manipulate non-physical/physical energy with mind
- Dowsing - able to locate water, gravesites, metals, and materials without scientific apparatus
- Divination - able to gain insight into a situation
- Conjuration - able to materialize physical objects from thin air
- Clairvoyance - able to perceive people, objects, locations, or events through extrasensory perception
- Clairsentience - able to perceive messages from emotions and feelings
- Clairolfactance - able to perceive knowledge through smell
- Clairgustance - able to perceive taste without physical contact
- Claircognizance - able to perceive knowledge through intrinsic knowledge
- Clairaudience - able to perceive knowledge through paranormal auditory means
- Chronokinesis - able to alter perception of time causing sense of time to slow down or speed up
- Biokinesis - able to change or control the DNA
- Automatic Writing - able to draw or write without conscious intent
- Aura Reading - able to perceive energy fields around people, places, and objects
- Astral Projection - out-of-body experience or the voluntary projection of consciousness
- Apportation - able to materialize, disappear, or teleport objects
7 August 2019
Drawbacks of Reinforcement Learning
- Reproducibility
- Resource Efficiency
- Susceptibility to Attacks
- Explainability/Accountability
Types of Filtering for Recommendations
- Adaptive
- Contextual (Context Similarity)
- Cognitive (Personality/Behavior)
- Content
- Bayesian
- Relevance Feedback
- Evolutionary Computation
- Deep Learning
- Collaborative (Model vs Memory)
- Matrix Factorization
- Tensor Factorization
- Clustering
- SVD
- Deep Learning
- PCA
- Pearson
- Bayesian
- Markov Decision Processes
- Interest/Intent
- Intent
- Search
- Interest
- Content Consumption
- Impact/Influence
- Social Feedback
- Likes
- Dislikes
- Mentions
- Shares
- Subscribes
- Hashtags
- Emojis
- Reviews
- Comments
- Trends
- Endorsements
- Opinions from Person of Influence
- Associative Connections (Primary/Secondary)
- Six-Degrees of Separation
- Item-based
- User-based
- Personalization
- Reinforcement Learning
- Reward
- Optimization
- Exploration/Exploitation
- Competitive
- Cooperative
- Semantic (with a Knowledge Graph)
- Demographic
Deep Learning Approaches for Recommendations:
- Autoencoders
- Neural Autoregressive Distribution Estimate
- Convolutional Neural Networks
- Recurrent Neural Network
- Long Short Term Memory
- Restricted Boltzmann Machine
- Adversarial Network
- Attentional Model
- Multilayer Perceptron
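As one concrete example from the filtering list above, collaborative filtering via matrix factorization can be sketched in a few lines of NumPy: approximate the ratings matrix R as U @ V.T by gradient descent on the observed entries, then read predictions for the unobserved ones. The ratings and hyperparameters below are illustrative only.

```python
import numpy as np

# Tiny user-item ratings matrix; 0 marks an unobserved rating.
R = np.array([[5, 3, 0],
              [4, 0, 1],
              [0, 1, 5]], dtype=float)
mask = R > 0
n_users, n_items, k = R.shape[0], R.shape[1], 2  # k latent factors

rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(n_users, k))
V = rng.normal(scale=0.1, size=(n_items, k))

lr, reg = 0.02, 0.01
for _ in range(10000):
    E = (R - U @ V.T) * mask       # error on observed entries only
    U += lr * (E @ V - reg * U)    # gradient step for user factors
    V += lr * (E.T @ U - reg * V)  # gradient step for item factors

pred = U @ V.T
# Unobserved cells of pred are the recommendations; observed cells
# should be reconstructed closely.
print(np.round(pred, 1))
```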
30 July 2019
28 July 2019
Cloud Providers
Most of Azure's cloud service offerings are basically drop-in replacements for Microsoft's own standalone software tools. For Microsoft, Azure seems to be an alternative way of locking in the customer via a re-purposed cloud option, which has so far proven useful through heavy, gimmicky marketing. GCP, on the other hand, provides many alternatives for big data but with very ineffective pricing, a lack of business-critical reliability, security constraints, lots of options to re-invent the wheel with vendor lock-in, still-limited SQL use cases, and limited overall services. AWS has proven to have a very effective pricing model as well as a wide range of services to cover business needs, including strong reliability and flexible options for managing services. For most organizations, especially for data science work, AWS is the go-to cloud solution. Azure and GCP still lag considerably in reliability and cloud service offerings, with ineffective pricing and, the biggest concern, vendor lock-in. In many cases, cloud providers are limited by their mission statements - what they are trying to achieve through their solutions to businesses and their future infrastructure development goals. For Microsoft, Windows is the ultimate success story, which one can see has evolved in parallel with Apple. But Linux has become the de facto operating system for the cloud, and for obvious reasons. Data as a commodity is a valuable asset to most organizations, and the management of risk in security and compliance is an enduring struggle for many of them. Especially for GDPR compliance, many organizations will want transparent data lineage. Can one trust the storage and processing of data on GCP? All Google services converge to some degree or another and get indexed by their search engine.
Invariably, the cost and risk of third-party cloud infrastructure versus in-house infrastructure will always be something companies have to weigh. It seems that, in the long run, organizations will take back control of their own data storage and processing needs. The trend is towards portable, smarter, stackable private cloud ownership, more flexibility in infrastructure management, and virtualization at an affordable cost. Start-ups may find it easier to reduce setup costs by leveraging third-party infrastructure. But as companies grow and their products gain market value, they may increase their independence by eventually moving away from third-party cloud dependency to their own in-house converged infrastructure, allowing greater flexibility to meet consumer expectations and the demands of their product services - enterprise enablement drives creative and profitable growth.
Labels: artificial intelligence, big data, Cloud, data science, devops, distributed systems, microservices
25 July 2019
Deep Learning Datasets
Deep Learning Datasets 2
Skymind Datasets
Dataset List
List of Wikipedia Datasets
Tensorflow Datasets
Google Datasearch (depends on how up-to-date their indexed dataset results are)
24 July 2019
Everyday Robots
Robots have, over the years, proven themselves worthy candidates for replacing mundane, labor-intensive manual work for humans, both commercially and at home. Not only do robots work more effectively, they are also extremely productive. In general, robots can be applied to most specialist labor, as they can be trained to be good at a particular aspect of work. But they may not yet be sufficiently capable of doing multiple things through adaptability in multi-class transfer learning. The following highlight some example robot use cases.
- automotive breakdown repair man/woman
- home and office cleaner
- rubbish disposal
- grocery shopper
- home and office security officer/inspector
- laundry service
- cook (chef)
- critic / reviewer
- gardener
- table setter
- mechanical turk
- post man/woman
- babysitter
- mystery shopper
- chauffeur
- home and office mover
- handy man/woman
- telephone/broadband installer/repair man/woman
- call centre agent
- lollipop man/woman
- school teacher
- office secretary
- family mediator
- office mediator
- crop duster
- nursing home nurse
- doctor and nurse
- nanny
- lawyer
- accountant
- assembly line worker
- dentist
- data entry clerk
- journalist
- financial analyst
- comedian
- musician
- artist
- telemarketer
- paramedic
- commercial and defence pilots
- public transport worker
- rail repair
- air traffic controller
- land traffic controller
- sea traffic controller
- metrologist
- kitchen porter
- crop pickers
- police man/woman
- fire man/woman
- immigration/border controller
- politician
- director
- photographer
- creative writer
- curator
- cheerleader
- gamer
- construction worker
- programmer
- logging worker
- fisher man/woman
- steel worker
- street sweeper
- refuse collector
- carpenter
- stunt man/woman
- courier
- wrestler
- boxer
- sports man/woman
- recycle waste worker
- power worker
- farmer
- roofer
- astronaut
- army & military officer
- bodyguard
- slaughterhouse worker
- mechanic
- metalcrafter
- search & rescue
- special forces (SAS, Delta Force, Seal, etc)
- sanitation worker
- land mine remover
- miner
- bush pilot
- lumberjack
- librarian
- human resources assistant
- salesman
- editor
- dance instructor
- bus conductor
- tourist guide
- stewardess
- cashier
- store replenisher
- data center operator
- taxi cab driver
- train driver
- lorry driver
- customer service advisor
- electrician
- vehicle washer
- bed maker
- bathroom cleaner
- pet walker
- oilfield driver
- derrick hand
- roustabout
- offshore diver
- rodent killer
- insect killer
- therapist
- architect
- actor
- backup singer
- backup dancer
- house builder
- waiter
- presenter
- manager
- hacker
- stripper (exotic dancer)
- sex worker
- hairdresser
- makeup artist
- fashion designer
- cameraman
- researcher
- chemist
- pharmacist
- landscapist
- baker
- ship builder
- car maker
- broadcast technician
- hotel helpdesk
- store helpdesk
- mall helpdesk
- site assistant
- tailor
- tutor
- pet trainer
- cartoonist
- reporter
- moderator
- painter
- plumber
- auditor
- financial trader
- financial broker
- financial advisor
- compliance advisor
- fraud advisor
- risk advisor
- surveillance agent
- social media agent
- bricklayer
- choreographer
- actuary
- physiotherapist
- tea/coffee maker
- pizza maker
- burger maker
- welder
- surveyor
- surgeon
- glazier
- tiler
- stonemason
- optician
- tool maker
- artisan
- sonographer
- radio technician
- sports coach
- bartender / barmaid
- bellboy
- paperboy
- drain inspector
- pet feeder
13 July 2019
Lucid Pipeline
Most AI solutions can be built as pipelined implementations with various sources to sinks from a set of generalizable models. Invariably, knowledge graph will act as a key layer for evolvable feature engineering that can be translated into ontological vectors and fed into AI models. Split the pipeline as a lucid funnel, lucid reactor, lucid ventshaft, and lucid refinery using a loose analogy of a distillation process. The following components highlight the key abstractions:
AI/DS Engine Layers:
- Disc (frontends - discovery/visualization layer)
- Docs (live specs via swagger, etc - documentation layer)
- APIs (proxy/gateway services connected with elasticsearch or solr - application layer)
- DS (models and semantics - AI layer)
- Eval (benchmarks, workbench and metrics - evaluation layer)
- Human (optional human in the loop - human/annotation layer)
- Tests (load, unit, uat, service, etc - testing layer)
- Admin (control for access management, operations workloads, and automation - administration layer)
- Funnel (ingestion, pre-process, post-process layer using brokers like Kafka/Kinesis)
- Reactor (reactive processes - workflow/transformational layer - via Spark, Beam, Flink, Dask, etc)
- Ventshaft (fuzzy matches, distance matches, probabilistic filters, relational matches, clusters, fake filters, fake matches, feature selection filters, component factors, informed searches, uninformed searches, string matches, projection filters, samplings, tree searches, validations, verifications - functional/utility layer)
- Refinery (context types, objects, attributes and methods as blueprints - entity/object layer)
- Datapiles (indexed data sources as services for document/column/graph stores - data access layer)
- Conf (environment configurations for nginx, etc - configuration layer)
- Cloud (connected services for AWS/GCP orchestration - infrastructure/platform layer)
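Under stated assumptions, the layered abstraction above can be sketched as a chain of composable stages - a record flows from source to sink through the funnel, reactor, and refinery in turn. All names here are illustrative placeholders, not a real library:

```python
# Minimal sketch of the lucid pipeline: each layer is a callable stage,
# and the pipeline threads a record through the stages in order.

def funnel(record):
    # ingestion / pre-processing layer: normalize raw input keys
    return {k.lower(): v for k, v in record.items()}

def reactor(record):
    # reactive/transformational layer: derive new fields
    record["token_count"] = len(str(record.get("text", "")).split())
    return record

def refinery(record):
    # entity/object layer: project onto a typed blueprint
    return {"text": record.get("text", ""), "tokens": record["token_count"]}

def pipeline(record, stages=(funnel, reactor, refinery)):
    # thread the record through each stage, source to sink
    for stage in stages:
        record = stage(record)
    return record

result = pipeline({"Text": "knowledge graphs feed AI models"})
```

In a production variant of this sketch, each stage would sit behind a broker (Kafka/Kinesis) or a Spark/Beam/Flink transform rather than a plain function call, but the composition idea is the same.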
30 June 2019
Sweet Beer
Sweet Stouts
Stouts
Lagers
Pale Ales
Belgians
Beers of World
Best Fruit Beers
Dessert Beers
World Beer Awards
International Beer Challenge
Brewing Awards
Beer Ranker
ProductOntology Beer
Labels: beer, big data, data science, food, linked data, natural language processing, semantic web, text analytics
19 June 2019
PermId
Labels: big data, data science, finance, linked data, natural language processing, semantic web, text analytics
11 June 2019
Sports Extended Reality
- Moment is everything (pre-event, event, post-event)
- Nothing is actually live
- Flat images and sense of presence
- 20/80 rule
- Time matters
Attributes of a sports experience:
- Hawkeye cameras
- Replays
- Live scores
- Clock graphics
- Broadcast video + audio (including commentary)
- Live text updates
- Fast cuts
- Music
- High-paced graphics
- Social media interaction (optional)
Live is a belief, an illusion, in sports. There are three grades of live sports: Live, Live-Live, and Live-to-Record (or Live-to-Tape). An important aspect arises in focusing on driving a story (narrative) and providing consumers a sense of control.
Labels: artificial intelligence, big data, data science, machine learning, natural language processing, sports, tv, virtual reality
7 June 2019
GeoPy
Labels: big data, data science, event, maps, natural language processing, python, text analytics
Scrapely
Labels: big data, data science, deep learning, intelligent web, machine learning, python, semantic web, webcrawler, webscraper
Ahmia
Labels: big data, data science, intelligent web, semantic web, webcrawler, webscraper, webservices
ScrapingHub Splash
Labels: big data, data science, JavaScript, natural language processing, python, text analytics, webcrawler, webscraper
20 May 2019
GOT Countdown
Labels: big data, data science, drama, entertainment, machine learning, predictive analytics, sentiment analysis, social networks, tv, tv drama
16 May 2019
Industrial Data Science
Over the years, Data Science as a field has emerged to play a pivotal role in many industry sectors, providing avenues for analytical growth and insights towards more effective products and services. However, several glaring aspects of the field are riddled with misconceptions and ineffective practices. Traditional Data Science was about data warehouses, a relational way of thinking, Business Intelligence, and overfitted models. In the current landscape, however, Artificial Intelligence as a discipline is more about out-of-the-box thinking, and it is having an impact on Data Science practice. Data Engineering and Data Science functions tend to merge into one in AI practice. Relational algebra is replaced with semantics and context via Knowledge Graphs, which form the important metadata layer for a Linked Data Lake. While traditional Data Science relied fully on statistical methods, the newer approaches combine Machine Learning with Knowledge Representation and Reasoning in a hybrid model for better Transfer Learning and generalizability. Deep Learning, a purely statistical method and a sub-field of Neural Networks, is by its very nature implemented as a set of distributed and probabilistic graphical models. It makes very little sense to split teams between Data Engineering and Data Science, as the person building the model also has to think about scalability and performance. Invariably, splitting teams means duplicated work, communication issues, and degraded output in production (when handed from Data Science to Data Engineering). In many AI domains, there is an inclination towards open-box thinking about problems. In AI, only 30% of the effort is Machine Learning, while the remaining 70% is Computer Science principles and theory. Evidence of this can be seen in the Norvig book, which is often the basis of many taught AI 101 courses.
Often at universities, in advanced courses, they neglect to cover the entire Data Science method and stress only Machine Learning and statistical methods at the exploratory stage, while forgetting the rest of the Computer Science concepts. As a result, we see many Data Scientists with PhDs who are ill-equipped to tackle practical business cases: productionizing their models against small and large datasets, with appropriate Feature Engineering for semantics, and the associated pipelining. Furthermore, at many institutions Feature Engineering is often skipped entirely, when it is really 70% of the Data Science method and possibly the most important stage of the process. Invariably, this Feature Engineering step is partially transferred over to the Data Engineering function. One has to wonder why the Data Scientist is only doing 30% of the work from the Data Science method, even after holding a PhD, while passing the remainder of the hard work to the Data Engineer as part of the formal ETL process. The whole point of a Knowledge Graph is really to add the value of semantics and context to your data, moving towards information and knowledge. This becomes very important not only for Feature Engineering but also as a feedback mechanism, where one can cyclically improve the model's learning while allowing the model to improve the semantics in a semi-supervised manner. The Knowledge Graph also enables natural language queries, making the data available to the entire organization - no longer any need to hire specialists who understand SQL in order to produce Business Intelligence reports for the business. The whole point is to make data available and accessible to the entire organization, while also increasing efficiency and enabling a manageable way of attaining trust through centralized governance and provenance of the data.
Thus, the data adapts to the organization's needs, rather than the organization having to adjust resources to the needs of working with the data. There needs to be a shift in how many organizations build Data Science teams, how the subject is taught at universities, and how they architect AI transformation solutions. Although Deep Learning is good at representation learning, it initially requires a large amount of training data. Where large amounts of training data are lacking, one can rely on semantic Knowledge Graphs, human input, and clustering techniques to get further with Data Science executions, which in the long term will have far greater benefits for an organization. Many organizations seem to ignore the value of metadata at the start, which, as the data grows, adds to the complexity and the many challenges of integration. Why must we always push for only statistical methods, when much of the direct value can be attained through inference over semantic metadata, or a combination of both approaches? Probability is, by nature, unintuitive for humans. When does the average human ever think in statistics as they go about their daily lives - traveling to work, buying groceries at a supermarket, talking to a colleague on the phone? Hardly ever. And yet an average human is still smarter in many respects - across domains of understanding and adaptability, through transfer learning and semantic associations - than the most sophisticated Machine Learning algorithm that can be trained to be good at a particular task. However, when the human Data Scientist arrives at work, they reduce the scope of the business problem-solution case to mere statistically derived methods.
If AI is to move forward, we must think beyond statistical methods when working through complex business cases, embrace flexible semantics, and take more inspiration from the human mind - for all the things we already take for granted in our daily lives that machines still find significantly complex to understand, adapt to, and learn.
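As a minimal sketch of the hybrid approach argued for above - deriving features from semantic metadata rather than statistics alone - the toy example below one-hot encodes an entity's knowledge-graph neighbourhood into a feature vector that a statistical model could consume. All triples and names are invented for illustration:

```python
# Hypothetical sketch: turning a tiny in-memory knowledge graph of
# (subject, predicate, object) triples into "ontological vector" features.

TRIPLES = [
    ("acme_corp", "is_a", "company"),
    ("acme_corp", "sector", "finance"),
    ("acme_corp", "located_in", "london"),
    ("bolt_ltd", "is_a", "company"),
    ("bolt_ltd", "sector", "energy"),
]

def neighbours(entity):
    # all (predicate, object) pairs attached to an entity in the graph
    return {(p, o) for s, p, o in TRIPLES if s == entity}

def feature_vector(entity, vocabulary):
    # one-hot encode the entity's semantic neighbourhood against a fixed
    # vocabulary of (predicate, object) pairs shared across entities
    pairs = neighbours(entity)
    return [1 if pair in pairs else 0 for pair in vocabulary]

# shared feature vocabulary derived from the whole graph
vocabulary = sorted({(p, o) for _, p, o in TRIPLES})
vec = feature_vector("acme_corp", vocabulary)
```

In practice the triples would live in a real triple store queried via SPARQL, and the vectors would feed a downstream model; the point of the sketch is only that semantic context can become model input directly.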
6 May 2019
Five Rules of Computer Input Popularity
- Cheap
- Reliable
- Comfortable
- Software that makes use of it
- Acceptable user error rate
29 April 2019
Legal Ontologies
Legal Norms = LegalRuleML, NRV
Policies = ODRL, LDR
Licenses = CC, L4LOD
Legal Doc Index = Eurovoc, ELI
Privacy GDPR = GDPRtEXT
Tenders and Procurement = LOTED2, PPROC
OpenLaws
Lynx
Labels: big data, data science, legal, linked data, machine learning, natural language processing, semantic web, text analytics
26 April 2019
Blockchain
Labels: big data, Cloud, contextual ads, data science, ecommerce, entertainment, finance, fraud, health, media, microservices, mobile, news, property, security, semantic web, social networks, telecommunications
8 April 2019
Anago
Labels: big data, data science, deep learning, machine learning, natural language processing, text analytics
Flair
Labels: big data, data science, deep learning, machine learning, natural language processing, text analytics
29 March 2019
AI Ethics
There are a lot of people claiming to be AI ethics experts, but the field is only just emerging. In the mainstream, the topic has only been around for a short while. So how can someone be an expert in it when there are still many unanswered research questions in the area?
- How can one focus on ethics without also focusing on morals? Morals are, after all, the basis of ethics.
- Does ethics in AI intrinsically have a universal equivalence? i.e. something codified as ethical in the West may not be sufficiently compatible with the East. Which attributes of ethics hold universally (for all) versus existentially (for some)?
- How can one control the abuse and falsely manipulated justification of AI ethics? e.g. someone trying to drive political/cultural change or influence in a society/organization using AI ethics?
- How do you make sure that the people in control of ethics, who by their own account call themselves ethics experts, are in fact ethical? Is AI only as ethical as the human that programmed it? Can the codification of AI ethics be programmed to mutate as defined by the environment and the changing norms of society - in so doing, allowing the AI agent to question ethical and moral dilemmas for/against humans?
- If one builds a moral reasoner in Horn clauses, can such reasoning then genetically mutate for ethics, on a case-by-case basis, for the conditioning of an AI agent? Can AI agents be influenced by other AI agents, as in a multiagent distributed system - argumentation via game theory and reinforcement policies, towards mediation and consensus?
- Can ethics and morals be defined in a semantically equivalent language?
- If one defines Horn clauses for moral reasoning and a set of ethical rules, can such moral/ethical conundrums be further defined using Markov decision processes, in the form of a neural network, for any and all states - good enough coverage of a global search space that can be further reasoned over with transfer learning?
- How do you resolve human bias in a so-called AI ethics expert?
- Who defines what is ethical and moral for AI? Is there an agreed gold standard of measure?
In general, a moral person wants to do the right thing, with a moral impulse that drives the best intentions. Morals define our principles, while ethics tend to be more practical: a set of codified rules that define our actions and behaviors. Although the two concepts are similar, they are not interchangeable, nor aligned in every case. Ethics are not always moral, and a moral action can be unethical.
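As a toy illustration of the Horn-clause moral reasoner raised in the questions above, the sketch below forward-chains over a hypothetical rule base. The rules and facts are invented purely for illustration, not a proposed ethics standard:

```python
# Minimal Horn-clause sketch: each rule is (body, head) - if every atom
# in the body is known, the head can be inferred. Naive forward chaining
# applies rules until no new atoms appear (a fixpoint).

RULES = [
    ({"causes_harm"}, "immoral"),
    ({"immoral", "codified_ban"}, "unethical"),
]

def forward_chain(facts, rules):
    # derive the closure of the facts under the rules
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived

conclusions = forward_chain({"causes_harm", "codified_ban"}, RULES)
```

Note the sketch even mirrors the closing point of the post: with the fact `causes_harm` alone, the reasoner derives `immoral` but not `unethical`, since no codified rule applies - the two notions come apart.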
AI Ethics Lab
22 January 2019
Huginn
Labels: artificial intelligence, big data, data science, intelligent web, multiagents, webcrawler
17 January 2019
NLP Games with Purpose
Games with a purpose are games applied to annotation tasks in NLP, making the process fun for the oracle (the annotator), often in a crowdsourced manner. A few examples in context are listed below:
- Phrase Detectives
- Sentiment Quiz
- Guess What
- ESP Game
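A minimal sketch of the crowdsourcing side of such games: aggregating the labels that players produce into a single annotation per item by majority vote. The data and function names here are illustrative:

```python
# Sketch of aggregating crowdsourced game annotations by majority vote,
# as a games-with-a-purpose backend might do behind the scenes.
from collections import Counter

def majority_label(annotations):
    # annotations: list of labels from different players for one item;
    # most_common(1) returns the single most frequent (label, count) pair
    label, _count = Counter(annotations).most_common(1)[0]
    return label

votes = {"sentence_1": ["positive", "positive", "negative"],
         "sentence_2": ["neutral", "neutral", "neutral"]}
gold = {item: majority_label(labels) for item, labels in votes.items()}
```

Real systems weight players by agreement history and handle ties explicitly, but simple majority voting is the baseline they refine.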